I love training large-scale models and am interested in scalable neural network architectures. I am also fascinated by the computing power that enables this, from distributed computing to the next generation of computing: quantum computers. I am passionate about building multimodal systems, advancing reasoning, and creating agents that continually evolve.
Master of Science in Computer Science
Bachelor of Science in Computer Science
Graduated one year early due to outstanding academic performance
Research Internship
Training Japanese Large Language Models Utilizing Large-Scale CPU Resources on Fugaku
Worked as a member of the model construction team.
The published model can be found here.
I am truly fortunate to be conducting research under the mentorship of Professor Rio Yokota.
Research Internship
The project link can be found here.
The deliverables will be published shortly.
Engineering Internship
We will be training multimodal models.
I have trained the model and made it publicly available on Hugging Face.
The published model can be found here.
Kazuki Fujii∗, Taishi Nakamura∗, Mengsay Loem, Hiroki Iida, Masanari Ohi, Kakeru Hattori, Hirai Shota, Sakae Mizuki, Rio Yokota, Naoaki Okazaki
Naoaki Okazaki, Kakeru Hattori, Hirai Shota, Hiroki Iida, Masanari Ohi, Kazuki Fujii, Taishi Nakamura, Mengsay Loem, Rio Yokota, Sakae Mizuki
Taishi Nakamura, Mayank Mishra, Simone Tedeschi, Yekun Chai, Jason T Stillerman, Felix Friedrich, Prateek Yadav, Tanmay Laud, Vu Minh Chien, Terry Yue Zhuo, Diganta Misra, Ben Bogin, Xuan-Son Vu, Marzena Karpinska, Arnav Varma Dantuluri, Wojciech Kusa, Tommaso Furlanello, Rio Yokota, Niklas Muennighoff, Suhas Pai, Tosin Adewumi, Veronika Laippala, Xiaozhe Yao, Adalberto Junior, Alpay Ariyak, Aleksandr Drozd, Jordan Clive, Kshitij Gupta, Liangyu Chen, Qi Sun, Ken Tsui, Noah Persaud, Nour Fahmy, Tianlong Chen, Mohit Bansal, Nicolo Monti, Tai Dang, Ziyang Luo, Tien-Tung Bui, Roberto Navigli, Virendra Mehta, Matthew Blumberg, Victor May, Huu Nguyen, Sampo Pyysalo