Boosted Lead Conversion Accuracy from 50% to 87% for a Global Shipping & Logistics Company (Machine Learning Case Study)

Prescience Decision Solutions, March 12, 2024

The client is a Fortune 500 company that provides transportation, supply chain and logistics services, and related digital solutions. With 700 container vessels, operations in 130 countries and over 7 million square feet of warehouse capacity across around 450 sites, it is one of the world’s largest companies in this sector.


The company draws on data from several systems for lead management and conversion. Salesforce serves as its customer relationship management (CRM) tool. The company also holds customer-specific data for various market segments and verticals, and it sources lead data from a third party that specializes in supply chain market intelligence. Additionally, the company uses important market indexes such as the Global Supply Chain Pressure Index (GSCPI), which is published by the Federal Reserve Bank of New York.

The company uses a 5-stage lead conversion process from Identification to Closure.

In a sales cycle, the likelihood of conversion increases as a lead moves from one stage to the next. However, the existing system did not factor in the real-world, stage-wise details of each opportunity. Instead, each stage was assigned a fixed conversion probability. Every lead in a particular stage automatically received the same probability, irrespective of the internal and external factors that could influence its chances of being won.

These fixed stage-wise conversion probabilities gave the company’s management incorrect insights. While the sales team knew which leads in each stage had a high probability of conversion, this was not reflected in the reports presented to senior management. Moreover, because of this scoring system, the sales and revenue forecasts were consistently well below the actual figures.
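
To illustrate why a single probability per stage can distort a forecast, here is a minimal sketch of a probability-weighted pipeline forecast. Only the 50% Negotiations figure comes from this case study (it is reported further down); the lead names, deal values and per-lead scores are hypothetical, and the `Lead` class and `weighted_forecast` helper are illustrative names, not the company's system.

```python
from dataclasses import dataclass

# Fixed probability per stage, as in the legacy approach described above.
# Only the 50% Negotiations figure comes from the case study.
FIXED_STAGE_PROBABILITY = {"Negotiations": 0.50}


@dataclass
class Lead:
    name: str
    stage: str
    deal_value: float          # hypothetical deal size
    model_probability: float   # hypothetical per-lead score from an ML model


def weighted_forecast(leads, probability_of):
    """Sum of deal value times win probability across the pipeline."""
    return sum(lead.deal_value * probability_of(lead) for lead in leads)


# Three made-up leads in the same stage with very different prospects.
pipeline = [
    Lead("lead_a", "Negotiations", 100_000.0, 0.90),
    Lead("lead_b", "Negotiations", 200_000.0, 0.80),
    Lead("lead_c", "Negotiations", 50_000.0, 0.20),
]

fixed = weighted_forecast(pipeline, lambda lead: FIXED_STAGE_PROBABILITY[lead.stage])
scored = weighted_forecast(pipeline, lambda lead: lead.model_probability)

print(f"fixed-probability forecast: {fixed:,.0f}")
print(f"per-lead forecast:          {scored:,.0f}")
```

In this made-up pipeline the two strongest leads are discounted to 50% under the fixed scheme, so the fixed forecast comes out lower than the per-lead one. That is the direction of error the paragraph above describes.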

The company needed an analytics partner to develop an AI-based solution that would replace the existing system and provide accurate stage-wise scoring of all current leads for each country.


The Prescience Decision Solutions team analyzed the data sources used in the existing lead conversion process. This data, stored in separate tables on Azure Data Lake Storage (ADLS Gen2), was merged and consolidated on an Azure Databricks cluster.

The company’s users had not assessed whether all the data elements were statistically significant or whether certain variables should be removed to improve accuracy. The Prescience team checked variance inflation factors (VIF) to identify multicollinearity and performed statistical tests to understand the significance of the input variables to the output. Based on the results of these checks, including Information Value (IV) scores, our team recommended the removal of certain variables.
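
The two kinds of checks named above can be sketched in plain Python. This is an illustrative sketch, not the team's actual code: the feature names and values are hypothetical, the VIF is computed from the diagonal of the inverse correlation matrix (a standard identity), and the 0.5 smoothing in the IV calculation is an assumed convention for avoiding empty bins. In practice a library routine such as `variance_inflation_factor` in statsmodels would typically be used instead of the hand-rolled matrix inversion.

```python
import math


def pearson(x, y):
    """Pearson correlation between two equal-length numeric sequences."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    var_x = sum((a - mx) ** 2 for a in x)
    var_y = sum((b - my) ** 2 for b in y)
    return cov / math.sqrt(var_x * var_y)


def invert(matrix):
    """Invert a small square matrix by Gauss-Jordan elimination with pivoting."""
    n = len(matrix)
    aug = [list(row) + [float(i == j) for j in range(n)]
           for i, row in enumerate(matrix)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(aug[r][col]))
        aug[col], aug[pivot] = aug[pivot], aug[col]
        scale = aug[col][col]
        aug[col] = [v / scale for v in aug[col]]
        for r in range(n):
            if r != col:
                factor = aug[r][col]
                aug[r] = [v - factor * p for v, p in zip(aug[r], aug[col])]
    return [row[n:] for row in aug]


def variance_inflation_factors(columns):
    """VIF per feature: the matching diagonal entry of the inverse correlation matrix."""
    names = list(columns)
    corr = [[pearson(columns[a], columns[b]) for b in names] for a in names]
    inverse = invert(corr)
    return {name: inverse[i][i] for i, name in enumerate(names)}


def information_value(bins, outcomes):
    """IV = sum over bins of (share of wins - share of losses) * ln(ratio of shares)."""
    labels = sorted(set(bins))
    wins = {b: 0 for b in labels}
    losses = {b: 0 for b in labels}
    for b, won in zip(bins, outcomes):
        (wins if won else losses)[b] += 1
    # Assumed convention: add 0.5 to every cell so empty bins do not hit log(0).
    win_total = sum(wins.values()) + 0.5 * len(labels)
    loss_total = sum(losses.values()) + 0.5 * len(labels)
    iv = 0.0
    for b in labels:
        p_win = (wins[b] + 0.5) / win_total
        p_loss = (losses[b] + 0.5) / loss_total
        iv += (p_win - p_loss) * math.log(p_win / p_loss)
    return iv


# Hypothetical features: shipment_volume nearly duplicates deal_value.
features = {
    "deal_value": [10, 20, 30, 40, 50, 60, 70, 80],
    "shipment_volume": [11, 19, 31, 39, 52, 58, 71, 79],
    "days_in_stage": [3, 4, 2, 1, 1, 2, 4, 3],
}
vifs = variance_inflation_factors(features)

# Hypothetical binned feature scored against two made-up won/lost outcome vectors.
bins = ["high"] * 4 + ["low"] * 4
iv_informative = information_value(bins, [1, 1, 1, 0, 0, 0, 0, 1])
iv_uninformative = information_value(bins, [1, 0, 1, 0, 1, 0, 1, 0])

for name, value in vifs.items():
    print(f"VIF {name}: {value:.2f}")
print(f"IV informative: {iv_informative:.3f}, uninformative: {iv_uninformative:.3f}")
```

A high VIF on the two near-duplicate features flags one of them as a candidate for removal, and an IV near zero flags a variable that carries little information about the outcome. The thresholds used to act on either score are a judgment call that the case study does not specify.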

Next, different ML models were evaluated for the various stages of the lead conversion process. These included Naïve Bayes, Support Vector Machine (SVM), Logistic Regression, Gradient Boosting and eXtreme Gradient Boosting (XGBoost), among others. The historical data for the past four years also reflected abnormally high sales and revenue figures, driven by Covid-related surges in demand across the transportation and logistics sector. Hence, k-fold cross-validation was performed to reduce training bias, and a new Period tag was introduced to identify whether a lead came from the Pre-Covid, Covid, Post-Covid or Current timeframe.
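
One way the k-fold split and the Period tag could work together is sketched below. The case study does not say how the folds were drawn, so stratifying by period is an assumption made here for illustration; the function names and the sample tags are hypothetical.

```python
from collections import defaultdict


def period_stratified_folds(period_tags, k):
    """Assign each row index to one of k folds, round-robin within each period.

    Every fold then holds a similar mix of periods, so no single fold is
    dominated by leads from the Covid-era surge.
    """
    by_period = defaultdict(list)
    for index, tag in enumerate(period_tags):
        by_period[tag].append(index)
    folds = [[] for _ in range(k)]
    for indices in by_period.values():
        for position, index in enumerate(indices):
            folds[position % k].append(index)
    return folds


def train_validation_splits(folds):
    """Yield (train_indices, validation_indices), holding out one fold at a time."""
    for held_out, validation in enumerate(folds):
        train = [i for j, fold in enumerate(folds) if j != held_out for i in fold]
        yield train, validation


# Hypothetical Period tags for 12 leads; real tags would be derived from lead dates.
tags = ["Pre-Covid", "Covid", "Post-Covid", "Current"] * 3
folds = period_stratified_folds(tags, k=3)
splits = list(train_validation_splits(folds))

for train, validation in splits:
    print(len(train), "train rows,", len(validation), "validation rows")
```

Each candidate model would be fitted on the `train` indices and scored on the `validation` indices of every split, with the scores averaged across folds before models are compared.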

Finally, the team analyzed vast volumes of chat data for the Negotiations stage. Embeddings of these text-based chats, processed with the t-distributed Stochastic Neighbor Embedding (t-SNE) algorithm, were combined across time periods into a single pipeline to train the selected models. This increased the accuracy of predictions for the Negotiations stage by an additional 2%.

The technologies used for this engagement included:

  1. Azure Databricks
  2. Python
  3. PySpark

With the new machine learning-based lead conversion program, the company transitioned from its fixed-probability pipeline to one that measured the probability of winning each lead at each stage. Prior to the implementation of the new system, the conversion probability of all opportunities in the Negotiations stage was fixed at 50%. After the solution was implemented, the accuracy of lead conversion predictions for opportunities in the Negotiations stage rose to 87%.

Not only did this provide the management and sales teams with clear-cut insights, but the revenue forecasts for the upcoming quarters and financial year were also massively improved.
