Taishi Nakamura

Google Scholar

Email: taishi [at] rio (dot) scrc (dot) iir (dot) isct (dot) ac (dot) jp

Hi! 👋 I'm Taishi, a second-year Master's student in the Department of Computer Science at the Institute of Science Tokyo, advised by Professor Rio Yokota.

My research focuses on scaling foundation models efficiently and optimizing their performance. I've been working on continual pre-training methods for large language models, mixture-of-experts architectures, and test-time compute approaches to enhance reasoning capabilities.

I'm passionate about open source and actively contribute to open-source projects. To build models with strong Japanese language capabilities from scratch, I led pre-training and ablation experiments for the LLM-jp ver2.0 models and spearheaded the development of the flagship LLM-jp-3 MoE under the guidance of Professor Jun Suzuki. To enhance Japanese language models through continual pre-training, I contributed to Swallow LLM via model evaluation, dataset construction, establishing continual pre-training methodologies, and model releases. For open multilingual model development, I led the training and release of the Aurora-M model in collaboration with Ontocord.ai and Professor Sampo Pyysalo.

Prior to my current research, our team was selected for the prestigious MITOU Target Program for Quantum Computing under Dr. Yuuki Tokunaga's mentorship. We developed an educational platform for quantum computing that is now publicly available as the Qualsimu Textbook.

Education

Apr 2024 - Present   Master of Science in Computer Science, Institute of Science Tokyo (formerly Tokyo Institute of Technology)

Apr 2021 - Mar 2024   Bachelor of Science in Computer Science, Tokyo Institute of Technology

Experience

May 2024 - Present   Research Assistant, National Institute of Informatics, Research and Development Center for Large Language Models

Feb 2024 - Present   Research Intern, Sakana AI

Aug 2023 - Present   Student Trainee, National Institute of Advanced Industrial Science and Technology

Apr 2023 - Present   Research Assistant, Institute of Science Tokyo (formerly Tokyo Institute of Technology)

Oct 2023 - Apr 2024   Research Intern, LLM-jp

Jun 2023 - Feb 2024   MITOU Target Program, Information-technology Promotion Agency, Japan

Publications

International conferences

Kou Misaki, Yuichi Inoue, Yuki Imajuku, So Kuroki, Taishi Nakamura, Takuya Akiba. Wider or Deeper? Scaling LLM Inference-Time Compute with Adaptive Branching Tree Search. International Conference on Learning Representations (ICLR) Workshop on Foundation Models in the Wild, 2025.

Taishi Nakamura, Takuya Akiba, Kazuki Fujii, Yusuke Oda, Rio Yokota, Jun Suzuki. Drop-Upcycling: Training Sparse Mixture of Experts with Partial Re-initialization. International Conference on Learning Representations (ICLR), 2025. [Hugging Face][Dataset][Code][Log]

So Kuroki, Taishi Nakamura, Takuya Akiba, Yujin Tang. Agent Skill Acquisition for Large Language Models via CycleQD. International Conference on Learning Representations (ICLR), 2025. [Hugging Face][Code][Blog]

Taishi Nakamura*, Mayank Mishra*, Simone Tedeschi*, ..., Matthew Blumberg, #Victor May, #Huu Nguyen, #Sampo Pyysalo (49 authors). Aurora-M: Open Source Continual Pre-training for Multilingual Language and Code. International Conference on Computational Linguistics (COLING) Industry Track, 2025. *Equal contribution. #Equal mentoring. [Hugging Face][Code][Blog]

Kazuki Fujii, Taishi Nakamura, Rio Yokota. llm-recipes: A Framework for Seamless Integration and Efficient Continual Pre-Training of Large Language Models. The International Conference for High Performance Computing, Networking, Storage, and Analysis (SC) Trillion Parameter Consortium Workshop, 2024. [Slide] [Code]

Kazuki Fujii*, Taishi Nakamura*, Mengsay Loem, Hiroki Iida, Masanari Ohi, Kakeru Hattori, Hirai Shota, Sakae Mizuki, Rio Yokota, Naoaki Okazaki. Continual Pre-Training for Cross-Lingual LLM Adaptation: Enhancing Japanese Language Capabilities. Conference on Language Modeling (COLM), 2024. *Equal contribution. [Hugging Face][Code]

Naoaki Okazaki, Kakeru Hattori, Hirai Shota, Hiroki Iida, Masanari Ohi, Kazuki Fujii, Taishi Nakamura, Mengsay Loem, Rio Yokota, Sakae Mizuki. Building a Large Japanese Web Corpus for Large Language Models. Conference on Language Modeling (COLM), 2024.

Preprints

Youmi Ma, Sakae Mizuki, Kazuki Fujii, Taishi Nakamura, Masanari Ohi, Hinari Shimada, Taihei Shiotani, Koshiro Saito, Koki Maeda, Kakeru Hattori, Takumi Okamoto, Shigeki Ishida, Rio Yokota, Hiroya Takamura, Naoaki Okazaki. Building Instruction-Tuning Datasets from Human-Written Instructions with Open-Weight Large Language Models. arXiv:2503.23714 [cs.CL], 2025.

Koshiro Saito, Sakae Mizuki, Masanari Ohi, Taishi Nakamura, Taihei Shiotani, Koki Maeda, Youmi Ma, Kakeru Hattori, Kazuki Fujii, Takumi Okamoto, Shigeki Ishida, Hiroya Takamura, Rio Yokota, Naoaki Okazaki. Why We Build Local Large Language Models: An Observational Analysis from 35 Japanese and Multilingual LLMs. arXiv:2412.14471 [cs.CL], 2024.

Kazuki Fujii, Taishi Nakamura, Rio Yokota. Balancing Speed and Stability: The Trade-offs of FP8 vs. BF16 Training in LLMs. arXiv:2411.08719 [cs.LG], 2024.

LLM-jp: Akiko Aizawa, Eiji Aramaki, ..., Taishi Nakamura, ..., Koichiro Yoshino (79 authors). LLM-jp: A Cross-organizational Project for the Research and Development of Fully Open Japanese LLMs. arXiv:2407.03963 [cs.CL], 2024. Authors are listed in alphabetical order. [Hugging Face][Dataset][Code]


Last updated: April 2025