January 28, 2026 • 5 min read

Cleaning, verifying, and validating data: are these steps missing from AI readiness initiatives?

By: Andrew Connelly

I recently attended the AgentForce World Tour event in New York City. My reason for attending was a bit different from most. Rather than focusing solely on demos, I wanted to speak directly with Salesforce partners deeply embedded in the ecosystem to understand how they are preparing customers for Agentic AI.

As I walked the showroom floor, I encountered many vendors with impressive technologies and applications that clearly solve problems Salesforce users face. They deliver value to the market, and many of them would be fantastic partners for YourICP. These vendors and partners represented a diverse range of offerings, from AI strategy consulting to scheduling.

Despite the diversity of solutions, there is one shared dependency across nearly all of them: customer and prospect data in Salesforce. When the data is unreliable, every downstream process breaks down.

What happens to the data before the demo?

As I met with partners, I began asking the same question repeatedly to understand how they prepare data before showcasing it inside their applications. In the vast majority of conversations, the answer was simply: “nothing.”

In roughly 10% of cases, the response was “we normalize the data.” However, this normalization typically amounted to basic deduplication and field mapping to align with Salesforce schemas.

While I did not speak with every vendor at the event, none of the partners I met described processes that ensured client Salesforce data, the primary input driving their applications, was clean, verified, or validated in advance. This important step was left to the customer to complete before using the application. Ensuring high-quality input data is a critical step that appears to be missing in many AI readiness initiatives (though not all), and its absence ultimately hurts results for both the customer and the vendor.

The impact of dirty data on agentic AI applications

So what effect does inaccurate or incomplete data have on these sophisticated Agentic AI applications? Consider a few representative use cases from the event.

  • Contact center applications: These solutions support client engagement, reporting, analytics, workflow management, and more. They rely on Salesforce data to interact with customers quickly and efficiently. When input data is inaccurate, even a small percentage of the time, downstream reporting, analysis, follow-ups, forecasts, and decision-making suffer. In the worst cases, valuable data is entirely missing.
  • Social media management platforms: These tools power social engagement, analytics, influencer marketing, employee advocacy, personalization, and workflows. While a social profile link may be accurate, the associated contact data often is not. Missing or incorrect details, such as name, email, or company, limit the ability to engage across channels in a truly personalized way, reducing both efficiency and overall value to the client.
  • General sales and marketing applications: This category includes outbound campaigns, inbound programs, nurture workflows, forms, and website engagement tools. Many of these solutions ingest net-new prospect data, which is frequently inaccurate by design as prospects attempt to limit follow-up. When this data is cleaned and verified, however, the likelihood of converting a cold prospect into an engaged lead increases significantly, delivering greater impact for the end client.

These are just a few examples from one event on a cold December day in New York City. While some companies clearly prioritize data accuracy, many do not. This represents a missed opportunity to enhance both the effectiveness of their applications and the outcomes for their customers.

Initial strategies to make data agentic AI–ready

Below are several practical steps organizations can take to improve data readiness:

  • Assess data quality upfront. Run your data (or your client’s data) through a no-cost analysis or cleansing exercise. This quickly reveals data health and helps teams estimate the effort, resources, and potential downstream risk. Addressing 5% bad data is very different from addressing 40%.
  • Mind the gaps. (Yes – I took the train from Boston to NYC.) Identify missing fields that could materially improve your product and customer outcomes. Examples include address, city, state, postal code, country, revenue, employee count, installed technologies, and mobile phone numbers. Depending on your application, these attributes can significantly enhance outputs. There may also be underutilized third-party datasets worth exploring.
  • Define and enforce your Ideal Customer Profile (ICP). Be clear about the types of clients you want to onboard—and do not hesitate to remove data that does not align with your ICP or your customer’s goals. Review and refine your ICP regularly to reflect changes in behavior, attrition, market trends, and macroeconomic conditions.
  • Allow for intelligent flexibility at the edges. While an ICP provides focus, watch for high-intent prospects just outside its boundaries. For example, a company with 975 employees showing rapid growth may still represent a strong opportunity even if your ICP starts at 1,000+ employees.
  • Refresh your data regularly. Many organizations accumulate data quickly but fail to maintain it over time. Relying on a single data source or cleaning data only in project-specific batches leaves large portions of the dataset stale or inaccurate.
  • Continuously evaluate and test. Use A/B testing to understand which data attributes most improve product performance and customer outcomes, and adjust your strategy accordingly.
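
To make the first few steps concrete, here is a minimal Python sketch of an upfront quality check, a gap count, and an ICP filter with some flexibility at the edges. Everything in it is an assumption for illustration: the field names (name, email, company, country), the email syntax check, the 5% edge tolerance, and the 20% growth threshold are hypothetical and are not taken from Salesforce or any vendor's product. A real assessment would also verify that emails and phone numbers are actually deliverable, which a syntax check cannot do.

```python
import re

# Hypothetical field names; a real Salesforce export will differ.
REQUIRED_FIELDS = ("name", "email", "company", "country")
# Syntax check only: this does not prove the mailbox exists.
EMAIL_PATTERN = re.compile(r"^[^@\s]+@[^@\s]+\.[^@\s]+$")


def record_issues(record):
    """Return a list of problems found in one contact record."""
    issues = []
    for field in REQUIRED_FIELDS:
        value = record.get(field)
        if value is None or not str(value).strip():
            issues.append("missing:" + field)
    email = str(record.get("email") or "").strip()
    if email and not EMAIL_PATTERN.match(email):
        issues.append("invalid:email")
    return issues


def quality_report(records):
    """Summarize the share of bad records and how often each issue occurs."""
    issue_counts = {}
    bad = 0
    for record in records:
        issues = record_issues(record)
        if issues:
            bad += 1
        for issue in issues:
            issue_counts[issue] = issue_counts.get(issue, 0) + 1
    total = len(records)
    bad_pct = round(100 * bad / total, 1) if total else 0.0
    return {"total": total, "bad": bad, "bad_pct": bad_pct, "issues": issue_counts}


def icp_fit(employee_count, growth_rate, min_employees=1000,
            edge_tolerance=0.05, high_growth=0.20):
    """Classify a company against an employee-count ICP.

    Companies slightly below the minimum are flagged for review instead of
    being dropped, but only when they also show strong growth.
    """
    if employee_count is None:
        return "unknown"
    if employee_count >= min_employees:
        return "in_icp"
    near_edge = employee_count >= min_employees * (1 - edge_tolerance)
    if near_edge and growth_rate is not None and growth_rate >= high_growth:
        return "edge_review"
    return "outside_icp"


sample = [
    {"name": "Ada", "email": "ada@example.com", "company": "Acme", "country": "US"},
    {"name": "Bob", "email": "bob[at]example.com", "company": "Acme", "country": "US"},
    {"name": "", "email": "cy@example.com", "company": "Acme", "country": None},
    {"name": "Dee", "email": "dee@example.com", "company": "Initech", "country": "CA"},
]
report = quality_report(sample)
print(report["bad_pct"], report["issues"])
print(icp_fit(975, 0.30))  # the 975-employee, fast-growing example from the list
```

The report gives the "5% versus 40%" number from the first step and a per-field view of the gaps from the second. The ICP function returns a separate "edge_review" label instead of a yes/no answer, so near-miss accounts reach a person for a decision and are not silently deleted.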

Let’s keep the conversation going

Do you believe that cleaning, verifying, and validating customer and prospect data early in the process can materially improve the success of Agentic AI applications? I’d welcome the discussion. Just send me a note here: AConnelly@youricp.com