Taishi Nakamura

I love training large-scale models and am interested in scalable neural network architectures. I am also fascinated by the computing power that enables this, including distributed computing and next-generation computing such as quantum computers. I am passionate about building multimodal systems, advancing reasoning capabilities, and creating agents that continually evolve.

Education

Apr 2024 - Present   Tokyo Institute of Technology

Master of Science in Computer Science

Apr 2021 - Mar 2024   Tokyo Institute of Technology

Bachelor of Science in Computer Science

Graduated one year early due to outstanding academic performance

Research experience

Feb 2024 - Present   Sakana AI

Research Internship

Sep 2023 - Present   TokyoTech-LLM

Focusing on continual pretraining from strong pretrained models, such as the Llama-2 family, to develop a strong Japanese language model

The published model can be found here.

The project link can be found here.

Jun 2023 - Present   GPT-Fugaku

Training Japanese large language models utilizing large-scale CPU resources on the supercomputer Fugaku

Jun 2023 - Present   LLM-JP

Active as a member of the model construction team.

The published model can be found here.

Apr 2023 - Present   YOKOTA Laboratory, Tokyo Institute of Technology

I am truly fortunate to be conducting research with Professor Rio Yokota as my mentor.

Apr 2022 - Aug 2023   A*Quantum

Research Internship

Industry Experience

Jun 2023 - Feb 2024   MITOU TARGET

The project link can be found here.

The deliverables will be published shortly.

Feb 2022 - Dec 2022   Crystal Method

Engineering Internship

Open Source Projects

Oct 2023 - Present   Hummingbird

We are training multimodal models.

Jun 2023 - Present   MDEL

I have trained the model and made it publicly available on Hugging Face.

The published model can be found here.

Preprints

Under review, 2024   Continual Pre-Training for Cross-Lingual LLM Adaptation: Enhancing Japanese Language Capabilities

Kazuki Fujii∗, Taishi Nakamura∗, Mengsay Loem, Hiroki Iida, Masanari Ohi, Kakeru Hattori, Hirai Shota, Sakae Mizuki, Rio Yokota, Naoaki Okazaki

Under review, 2024   Building a Large Japanese Web Corpus for Large Language Models

Naoaki Okazaki, Kakeru Hattori, Hirai Shota, Hiroki Iida, Masanari Ohi, Kazuki Fujii, Taishi Nakamura, Mengsay Loem, Rio Yokota, Sakae Mizuki

Under review, 2024   Aurora-M: The First Open Source Multilingual Language Model Red-teamed according to the U.S. Executive Order

Taishi Nakamura, Mayank Mishra, Simone Tedeschi, Yekun Chai, Jason T Stillerman, Felix Friedrich, Prateek Yadav, Tanmay Laud, Vu Minh Chien, Terry Yue Zhuo, Diganta Misra, Ben Bogin, Xuan-Son Vu, Marzena Karpinska, Arnav Varma Dantuluri, Wojciech Kusa, Tommaso Furlanello, Rio Yokota, Niklas Muennighoff, Suhas Pai, Tosin Adewumi, Veronika Laippala, Xiaozhe Yao, Adalberto Junior, Alpay Ariyak, Aleksandr Drozd, Jordan Clive, Kshitij Gupta, Liangyu Chen, Qi Sun, Ken Tsui, Noah Persaud, Nour Fahmy, Tianlong Chen, Mohit Bansal, Nicolo Monti, Tai Dang, Ziyang Luo, Tien-Tung Bui, Roberto Navigli, Virendra Mehta, Matthew Blumberg, Victor May, Huu Nguyen, Sampo Pyysalo

Domestic Conferences in Japan

NLP 2024   Building Large Language Models with Strong Japanese Capabilities through Continual Pre-training

Kazuki Fujii∗, Taishi Nakamura∗, Mengsay Loem, Hiroki Iida, Masanari Ohi, Kakeru Hattori, Shota Hirai, Sakae Mizuki, Rio Yokota, Naoaki Okazaki

NLP 2024   Efficiently Enhancing the Japanese Capabilities of Large Language Models: Vocabulary Expansion and Use of Parallel Corpora in Continual Pre-training

Sakae Mizuki, Hiroki Iida, Kazuki Fujii, Taishi Nakamura, Mengsay Loem, Masanari Ohi, Kakeru Hattori, Shota Hirai, Rio Yokota, Naoaki Okazaki

NLP 2024   Swallow Corpus: A Large-Scale Japanese Web Corpus

Naoaki Okazaki, Kakeru Hattori, Shota Hirai, Hiroki Iida, Masanari Ohi, Kazuki Fujii, Taishi Nakamura, Mengsay Loem, Rio Yokota, Sakae Mizuki

Contact


Last updated: May 2024