Jan 23, 2023

Search Model Serving GPU Metrics

Walmart Search has embarked on the journey of adopting deep learning in the search ecosystem to improve search relevance. For our pilot use case, we served the computationally intensive BERT Base model at runtime, with the objective of achieving low latency and high throughput.
We built a highly scalable model-serving platform on TorchServe to enable fast runtime inferencing for our evolving models. TorchServe gives us the flexibility to support multiple execution modes.
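As a rough illustration of how a model is exposed through TorchServe, the sketch below mimics the preprocess/inference/postprocess lifecycle of a TorchServe custom handler. The class name, the stub model, and the request payloads are hypothetical; in a real deployment the handler would extend ts.torch_handler.base_handler.BaseHandler and be packaged with torch-model-archiver.

```python
# Minimal sketch of a TorchServe-style custom handler. QueryHandler and
# stub_model are hypothetical stand-ins; a production handler would extend
# ts.torch_handler.base_handler.BaseHandler and load a real BERT model.

class QueryHandler:
    """Follows TorchServe's request lifecycle: preprocess -> inference -> postprocess."""

    def __init__(self, model):
        # In TorchServe, the model is loaded in initialize() from the .mar archive.
        self.model = model

    def preprocess(self, requests):
        # Extract the raw query text from each request body.
        return [r.get("data") or r.get("body") for r in requests]

    def inference(self, inputs):
        # Run the model on the batch of queries.
        return [self.model(text) for text in inputs]

    def postprocess(self, outputs):
        # TorchServe expects exactly one response element per request.
        return [{"score": score} for score in outputs]

    def handle(self, requests):
        return self.postprocess(self.inference(self.preprocess(requests)))


# Stub standing in for a BERT relevance scorer.
stub_model = lambda text: len(text) / 100.0

handler = QueryHandler(stub_model)
print(handler.handle([{"data": "red running shoes"}]))  # → [{'score': 0.17}]
```

Separating the lifecycle into these three stages is what lets TorchServe batch requests and swap models without changing the serving layer.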
Evolution

One monolithic Search Query Understanding application was responsible for understanding the user's intent behind the search query. Through a single Java Virtual Machine (JVM)-hosted web application, it loaded and served multiple models. Experimental models were loaded onto the same query understanding application. These models were large, and computation was expensive.
With this approach, we faced the following limitations: