I love training large-scale models and am interested in scalable neural network architectures. I am also fascinated by the computing power that enables this, from distributed computing to the next generation of computing: quantum computers. I am passionate about building multimodal systems, advancing reasoning, and creating agents that continually evolve.
Master of Science in Computer Science
Bachelor of Science in Computer Science
Graduated one year early due to outstanding academic performance
Research Internship
Training Japanese Large Language Models Utilizing Large-Scale CPU Resources on Fugaku
Worked as a member of the model construction team.
The published model can be found here.
I am truly fortunate to be conducting research under the mentorship of Professor Rio Yokota.
Research Internship
The project link can be found here.
The deliverables will be published shortly.
Engineering Internship
We will be training multimodal models.
I have trained the model and made it publicly available on Hugging Face.
The published model can be found here.
Kazuki Fujii∗, Taishi Nakamura∗, Mengsay Loem, Hiroki Iida, Masanari Ohi, Kakeru Hattori, Hirai Shota, Sakae Mizuki, Rio Yokota, Naoaki Okazaki
Naoaki Okazaki, Kakeru Hattori, Hirai Shota, Hiroki Iida, Masanari Ohi, Kazuki Fujii, Taishi Nakamura, Mengsay Loem, Rio Yokota, Sakae Mizuki
Taishi Nakamura, Mayank Mishra, Simone Tedeschi, Yekun Chai, Jason T Stillerman, Felix Friedrich, Prateek Yadav, Tanmay Laud, Vu Minh Chien, Terry Yue Zhuo, Diganta Misra, Ben Bogin, Xuan-Son Vu, Marzena Karpinska, Arnav Varma Dantuluri, Wojciech Kusa, Tommaso Furlanello, Rio Yokota, Niklas Muennighoff, Suhas Pai, Tosin Adewumi, Veronika Laippala, Xiaozhe Yao, Adalberto Junior, Alpay Ariyak, Aleksandr Drozd, Jordan Clive, Kshitij Gupta, Liangyu Chen, Qi Sun, Ken Tsui, Noah Persaud, Nour Fahmy, Tianlong Chen, Mohit Bansal, Nicolo Monti, Tai Dang, Ziyang Luo, Tien-Tung Bui, Roberto Navigli, Virendra Mehta, Matthew Blumberg, Victor May, Huu Nguyen, Sampo Pyysalo