Microsoft updates Bing search with new language models for faster and more accurate results
Microsoft has updated Bing's search technology by integrating both large language models (LLMs) and small language models (SLMs), aiming to improve speed and accuracy while reducing costs. The new models are designed to handle complex queries more efficiently. The update also uses NVIDIA's TensorRT-LLM, which has significantly improved the performance of Bing's "Deep Search" feature: latency has dropped from 4.76 seconds to 3.03 seconds per batch (about 36% lower), and throughput has risen from 4.2 to 6.6 queries per second (about 57% higher). For users, the changes are meant to deliver faster, more relevant results without sacrificing quality.
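
The relative size of the reported latency and throughput changes can be checked with a few lines of arithmetic. The sketch below uses only the four figures quoted above; the function and variable names are illustrative and do not come from Microsoft or NVIDIA.

```python
def percent_change(before: float, after: float) -> float:
    """Relative change from `before` to `after`, as a percentage."""
    return (after - before) / before * 100.0


# Figures as reported in the article.
latency_before_s, latency_after_s = 4.76, 3.03            # seconds per batch
throughput_before_qps, throughput_after_qps = 4.2, 6.6    # queries per second

latency_change = percent_change(latency_before_s, latency_after_s)
throughput_change = percent_change(throughput_before_qps, throughput_after_qps)
speedup = latency_before_s / latency_after_s  # how many times faster a batch completes

print(f"Latency change:    {latency_change:+.1f}%")
print(f"Throughput change: {throughput_change:+.1f}%")
print(f"Batch speedup:     {speedup:.2f}x")
```

Under these figures, latency falls by roughly 36% and throughput rises by roughly 57%. The two numbers are measured in different units (seconds per batch versus queries per second), so they are not expected to match each other exactly.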