Building versus buying: the enterprise approach that actually works

The boardroom was divided. Half the executives wanted to build our own AI platform — "It's our competitive advantage!" they argued. The other half wanted to buy from established vendors — "Why reinvent the wheel?" they countered. Sound familiar?

This debate plays out in enterprises worldwide, but here's what I've learned: the companies that win don't choose sides. They blend both approaches strategically, creating what I call a "hybrid AI architecture" that leverages the best of both worlds. The manuscripts I've studied reinforce this — success comes not from rigid adherence to build or buy, but from thoughtful integration that amplifies human capabilities.

A growing consensus points to an 80/20 rule: roughly 80% of AI needs are met by purchased or subscription-based solutions, while the remaining 20% call for custom-built AI where deep integration or unique IP is critical. But even this rule needs context. The real art lies in knowing which 20% to build and how to make that custom development count.

Should we build AI in-house or buy from vendors/partners?

The build versus buy decision starts with a fundamental question: what's your competitive edge? If AI will be a pillar of your strategy, owning your technology stack can offer long-term advantages—greater customization, the ability to refine models over time, and data ownership.

But ownership isn't always an advantage. Even companies like Meta and Microsoft faced spectacular failures: M, Facebook's virtual assistant launched in 2015 and discontinued in 2018, and Tay, Microsoft's chatbot pulled offline within a day of its 2016 launch. These failures remind us that building AI isn't just technically challenging — it's organizationally complex.

The manuscript on human-AI collaboration offers a nuanced perspective: the question isn't whether to build or buy, but how to create systems where humans and AI elevate each other. This might mean buying foundational models but building custom interfaces. Or purchasing analytics platforms but developing proprietary algorithms for your specific domain.

Custom AI solutions typically range from $100,000 to $500,000+ for enterprise-grade implementations, whereas off-the-shelf AI platforms often start with seemingly affordable monthly subscriptions of $200-$400. But these headline figures hide the true costs. Built systems require ongoing maintenance, talent retention, and infrastructure scaling. Bought systems might have hidden costs in customization, integration, and vendor lock-in.
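To see why the headline figures mislead, it helps to run a quick back-of-the-envelope comparison. Every number in the sketch below (build cost, maintenance rate, seat count, subscription price, integration effort) is an assumption for illustration; the point is that the crossover depends on scale and time horizon, not on the sticker price.

```python
# Illustrative three-year cost sketch; every figure here is an assumption, not a benchmark.

def build_tco(initial_build=300_000, annual_maintenance_rate=0.25, years=3):
    """Custom build: upfront cost plus yearly maintenance as a fraction of the build cost."""
    return initial_build + initial_build * annual_maintenance_rate * years

def buy_tco(monthly_fee=300, seats=200, integration=80_000, years=3):
    """Off-the-shelf: per-seat subscription plus a one-time integration/customization effort."""
    return monthly_fee * seats * 12 * years + integration

print(f"Build, 3-year total: ${build_tco():,.0f}")  # $525,000
print(f"Buy,   3-year total: ${buy_tco():,.0f}")    # $2,240,000 at 200 seats
```

At twenty seats instead of two hundred, the bought option wins comfortably; the arithmetic, not the vendor pitch, should drive the decision.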

The most successful strategies recognize that build versus buy isn't a one-time decision but an evolving portfolio. Start with bought solutions to learn quickly, identify where custom development adds real value, then build selectively where you have genuine differentiation.

Which cloud/hardware stack best supports enterprise-grade GenAI?

Choosing the right infrastructure for GenAI is like selecting the foundation for a skyscraper — get it wrong, and everything built on top becomes unstable. MLOps platforms offer an end-to-end, unified environment that connects data pipelines, model training, deployment, and monitoring in a scalable and automated workflow.

The major cloud providers — AWS, Azure, and Google Cloud — each offer comprehensive AI stacks. But "comprehensive" doesn't mean "best for you." AWS excels in breadth and maturity, Azure integrates seamlessly with enterprise Microsoft environments, and Google Cloud offers cutting-edge AI research translated into products. The choice depends on your existing infrastructure, team expertise, and specific use cases.

But cloud isn't the only consideration. Edge computing will take center stage, as industries from healthcare to retail deploy localized AI solutions. This means thinking beyond centralized cloud to distributed architectures that process data where it's generated.

Hardware acceleration through GPUs remains critical for training large models, but inference increasingly runs on specialized chips. The manuscript on AI operations emphasizes that hardware choices should align with workload characteristics. Training might require high-end GPUs in the cloud, while inference might run on edge devices or specialized accelerators.
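To make that alignment concrete, a rough memory estimate is often enough to show why training and inference belong on different hardware. The multipliers in this sketch are common rules of thumb, not vendor guidance.

```python
# Rule-of-thumb memory estimates; the multipliers are rough assumptions, not vendor guidance.

def training_memory_gb(params_billion, bytes_per_param=16):
    """Mixed-precision training with Adam: weights, gradients, and optimizer
    state are often estimated at roughly 16 bytes per parameter."""
    return params_billion * bytes_per_param

def inference_memory_gb(params_billion, bytes_per_param=0.5):
    """4-bit quantized weights for serving on an edge device or small accelerator."""
    return params_billion * bytes_per_param

print(training_memory_gb(7))   # ~112 GB: multi-GPU cloud territory
print(inference_memory_gb(7))  # ~3.5 GB: fits on an edge device
```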

The winning strategy often involves multiple infrastructure providers. Use cloud for elastic training workloads, on-premises infrastructure for sensitive data processing, and edge for real-time inference. This multi-cloud, hybrid approach provides flexibility and avoids vendor lock-in.
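A minimal sketch of what such a placement policy can look like in code, with made-up workload attributes (data sensitivity, latency budget, burstiness) standing in for whatever criteria your organization actually applies:

```python
from dataclasses import dataclass

@dataclass
class Workload:
    name: str
    sensitive_data: bool      # must it stay on-premises?
    latency_budget_ms: int    # tight real-time budgets push toward the edge
    bursty: bool              # elastic demand favors cloud

def place(w: Workload) -> str:
    """Route a workload to on-premises, edge, or cloud using simple rules (illustrative only)."""
    if w.sensitive_data:
        return "on-premises"
    if w.latency_budget_ms < 50:
        return "edge"
    return "cloud"

print(place(Workload("model-training", sensitive_data=False, latency_budget_ms=10_000, bursty=True)))   # cloud
print(place(Workload("pii-scoring", sensitive_data=True, latency_budget_ms=200, bursty=False)))         # on-premises
print(place(Workload("shelf-camera-inference", sensitive_data=False, latency_budget_ms=30, bursty=False)))  # edge
```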

How do we choose between open-source and proprietary AI models?

The open-source versus proprietary debate in AI has unique characteristics. Incumbents have always benefited from established trust and existing distribution, but in the AI era they're increasingly outperformed on product quality and velocity by AI-native competitors.

Open-source models offer compelling advantages: no licensing costs, community innovation, transparency for debugging, freedom from vendor lock-in, and the ability to customize deeply. But they also come with responsibilities: security validation, ongoing maintenance, integration complexity, and the need for specialized expertise.

Proprietary models provide different benefits: enterprise support, regular updates, compliance certifications, integrated tooling, and clear accountability. Yet they bring constraints: licensing costs, limited customization, vendor dependence, and potential obsolescence.

The manuscript's perspective on "hybrid intelligence" applies here too. Many organizations use open-source models for experimentation and non-critical applications while deploying proprietary solutions for mission-critical uses. Or they might use proprietary models as a baseline but fine-tune open-source alternatives for specific domains.
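One way to make that portfolio concrete is a thin routing layer that sends mission-critical traffic to the proprietary baseline and non-critical, domain-specific traffic to a self-hosted, fine-tuned open model. The sketch below is illustrative only; the client classes, the endpoint, and the domain names are placeholders, not any vendor's actual SDK.

```python
# Hypothetical model router; the clients and endpoint below are placeholders, not a real SDK.

class ProprietaryClient:
    def complete(self, prompt: str) -> str:
        # In practice: call your vendor's hosted API here.
        return f"[proprietary] {prompt[:40]}..."

class OpenSourceClient:
    def __init__(self, endpoint: str = "http://localhost:8000/generate"):
        self.endpoint = endpoint  # e.g., your self-hosted, fine-tuned open model

    def complete(self, prompt: str) -> str:
        # In practice: POST the prompt to your own inference server.
        return f"[open-source @ {self.endpoint}] {prompt[:40]}..."

class ModelRouter:
    """Send domain-specific, non-critical traffic to the fine-tuned open model,
    and mission-critical traffic to the proprietary baseline."""

    def __init__(self):
        self.proprietary = ProprietaryClient()
        self.open_source = OpenSourceClient()

    def complete(self, prompt: str, domain: str, mission_critical: bool) -> str:
        if mission_critical:
            return self.proprietary.complete(prompt)
        if domain in {"claims-triage", "catalog-enrichment"}:  # domains with a fine-tuned model
            return self.open_source.complete(prompt)
        return self.proprietary.complete(prompt)

router = ModelRouter()
print(router.complete("Summarize this claim...", domain="claims-triage", mission_critical=False))
```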

In 2023, average spend across foundation model APIs, self-hosting, and fine-tuning was roughly $7 million among the dozens of companies Andreessen Horowitz surveyed. This spending often splits between open and proprietary solutions, creating a portfolio approach that balances risk and innovation.

When should we sunset legacy automation in favour of AI?

The question of when to replace legacy automation with AI is less about technology and more about value creation. Legacy systems, despite their limitations, often embed years of business logic and handle edge cases that new AI systems might miss.

The decision to build or buy GenAI tools is often portrayed as binary, but Wayfair and Expedia illustrate the advantages of a hybrid approach. This same principle applies to legacy replacement — it's rarely all-or-nothing.

The manuscript on AI operations suggests a "wrapper" approach where AI augments rather than replaces legacy systems initially. This might mean: AI preprocessing data before it enters legacy systems, AI post-processing legacy system outputs, AI handling new use cases while legacy handles established ones, or gradual migration of functions from legacy to AI.
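Here is a minimal sketch of that wrapper pattern: the legacy system stays untouched as the system of record, while AI steps sit in front of it and behind it. The functions below (the rule-based legacy_process, the toy classifier, the anomaly check) are stand-ins for your actual systems.

```python
# Wrapper pattern sketch; legacy_process and the AI steps are stand-ins for real systems.

def legacy_process(record: dict) -> dict:
    """Existing rule-based system: untouched, still the system of record."""
    record["status"] = "approved" if record.get("amount", 0) < 10_000 else "manual_review"
    return record

def ai_preprocess(record: dict) -> dict:
    """AI in front of the legacy system: e.g., derive structured fields from free text."""
    record.setdefault("category", classify(record.get("description", "")))
    return record

def ai_postprocess(record: dict) -> dict:
    """AI behind the legacy system: e.g., flag anomalies the old rules never anticipated."""
    if record["status"] == "approved" and looks_anomalous(record):
        record["status"] = "manual_review"
    return record

def classify(text: str) -> str:
    return "travel" if "flight" in text.lower() else "other"   # placeholder for a real model

def looks_anomalous(record: dict) -> bool:
    return record.get("amount", 0) > 9_000                     # placeholder for a real model

def handle(record: dict) -> dict:
    return ai_postprocess(legacy_process(ai_preprocess(record)))

print(handle({"description": "Flight to client site", "amount": 9_500}))
```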

Key indicators that it's time to sunset legacy automation include: maintenance costs exceeding value delivered, inability to handle new data types or volumes, lack of expertise to maintain systems, and availability of proven AI alternatives. But timing matters. Move too early, and you risk disrupting stable operations. Move too late, and competitors gain advantage.

The most successful transitions treat legacy sunset as a journey, not an event. They maintain parallel systems during transition, thoroughly test AI replacements, preserve critical business logic, and plan for rollback if needed.

What interoperability standards are emerging for enterprise AI stacks?

Interoperability in AI stacks remains an evolving challenge. Unlike traditional software with established protocols, AI systems often use proprietary formats and interfaces. But standards are emerging, driven by enterprise demand for flexibility.

Many MLOps tools are single-purpose solutions that address a specific stage, such as model training, monitoring, or data versioning. The push for interoperability aims to let these point solutions work together seamlessly.

Key emerging standards include: ONNX for model interoperability, MLflow for experiment tracking and model registry, Kubeflow for orchestration, and OpenAPI for AI service interfaces. These standards enable what the manuscript calls "composable AI architecture" — mixing and matching components from different vendors.
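As a small illustration of how a couple of these pieces compose, the sketch below trains a model, tracks the run with MLflow, and exports the model to ONNX so any ONNX-compatible runtime, whether cloud, on-premises, or edge, can serve it. It assumes scikit-learn, skl2onnx, and mlflow are installed; adapt names and paths to your own stack.

```python
# Sketch assuming scikit-learn, skl2onnx, and mlflow are installed; adapt to your stack.
import mlflow
from sklearn.datasets import load_iris
from sklearn.linear_model import LogisticRegression
from skl2onnx import convert_sklearn
from skl2onnx.common.data_types import FloatTensorType

X, y = load_iris(return_X_y=True)

with mlflow.start_run(run_name="composable-demo"):
    model = LogisticRegression(max_iter=500).fit(X, y)
    mlflow.log_param("model_type", "logistic_regression")
    mlflow.log_metric("train_accuracy", model.score(X, y))

    # Export to ONNX so any ONNX-compatible runtime can serve the same artifact.
    onnx_model = convert_sklearn(
        model, initial_types=[("input", FloatTensorType([None, X.shape[1]]))]
    )
    with open("model.onnx", "wb") as f:
        f.write(onnx_model.SerializeToString())
    mlflow.log_artifact("model.onnx")  # the exported model travels with the tracked run
```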

But standards alone don't guarantee interoperability. Real-world integration requires: common data formats and schemas, consistent authentication and authorization, shared monitoring and logging approaches, and unified governance frameworks. Organizations leading in AI create their own interoperability layers, building abstractions that hide vendor-specific details.

The future points toward AI platforms that are both integrated and open. For now, it is easier to adopt AI through application vendors than to build your own platform, because the market for enterprise platform tools remains highly fragmented. This fragmentation drives demand for standards that enable best-of-breed architectures.

Sources

Medium - Maor Ezer. (2025). "Enterprise AI in 2025: An 80/20 Balance of Buy vs. Build."

Andreessen Horowitz. (2025). "How 100 Enterprise CIOs Are Building and Buying Gen AI in 2025."

Netguru. (2025). "Build vs Buy AI: Which Choice Saves You Money in 2025?"

HP Tech Takes. (2025). "Enterprise AI Services: Build vs. Buy Decision Framework."

VentureBeat. (2025). "Build or buy? Scaling your enterprise gen AI pipeline in 2025."

EnterpriseBot. (2024). "Enterprise Generative AI: Build vs. Buy."

Capella Solutions. (2025). "Building vs Buying AI Solutions: A Decision Framework for Enterprise Leaders."

Astera. (2023). "Build vs Buy: How it applies to Enterprise Software."

TechCrunch. (2024). "From AI agents to enterprise budgets, 20 VCs share their predictions on enterprise tech in 2025."

Andreessen Horowitz. (2024). "16 Changes to the Way Enterprises Are Building and Buying Generative AI."

DigitalOcean. (2025). "10 MLOps Platforms to Streamline Your AI Deployment in 2025."

Neptune.ai. (2025). "MLOps Landscape in 2025: Top Tools and Platforms."