llama.cpp SYCL Backend
Summary
Developed the SYCL/DPC++ backend for llama.cpp, ported from the CUDA backend, achieving >10x performance gains on Intel GPUs (Max, Flex, Arc) compared with the OpenCL implementation. Collaborated with the community maintainers and responded to Intel-related issues. Published blog: Run LLMs on Intel GPUs Using llama.cpp, https://medium.com/intel-analytics-software/run-llm-on-intel-gpus-using-llama-cpp-579d3595954e
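As a rough illustration of how this backend is used, the following is a minimal build-and-run sketch based on the llama.cpp SYCL documentation. Flag names and binary names here reflect recent llama.cpp releases and may differ in older ones (earlier builds used `LLAMA_SYCL` instead of `GGML_SYCL`); the model path is a placeholder, not from the original text.

```shell
# Load the Intel oneAPI environment (provides the icx/icpx DPC++ compilers).
source /opt/intel/oneapi/setvars.sh

# Configure and build llama.cpp with the SYCL backend enabled.
cmake -B build -DGGML_SYCL=ON \
      -DCMAKE_C_COMPILER=icx -DCMAKE_CXX_COMPILER=icpx
cmake --build build --config Release -j

# List SYCL devices visible to the backend, then run inference with all
# model layers offloaded to the Intel GPU (-ngl); model path is a placeholder.
./build/bin/llama-ls-sycl-device
./build/bin/llama-cli -m models/llama-2-7b.Q4_0.gguf -p "Hello" -ngl 33
```

With `-ngl` set high enough to cover every layer, the prompt and generation phases both execute on the GPU, which is where the reported speedup over the OpenCL path comes from.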