PostQode’s AI Odyssey: Part 2 - The Transformation

In the first part of our journey, I shared how we discovered the transformative potential of AI for PostQode and decided to pivot our strategy towards integrating AI deeply into our platform. Now, let’s dive into the technical choices and steps we took to bring this vision to life.

Implementing the RAG System

We decided to implement a Retrieval-Augmented Generation (RAG) system to enhance PostQode’s capabilities. Our goal was to build a comprehensive project context before generating test cases or automating scripts. We believed that understanding the project context would enable us to create more accurate and effective test cases.
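As a rough illustration of what "building project context first" can look like, the sketch below packs retrieved text chunks into a prompt under a size budget. It is not PostQode's actual code: the function name `build_prompt`, the character budget, and the prompt wording are all assumptions made for the example.

```python
from typing import List


def build_prompt(requirement: str, chunks: List[str], max_context_chars: int = 2000) -> str:
    """Assemble a test-generation prompt from retrieved project context.

    `chunks` is assumed to be sorted by relevance, best first.
    """
    selected: List[str] = []
    used = 0
    for chunk in chunks:
        # Stop at the first chunk that would exceed the (hypothetical) budget.
        if used + len(chunk) > max_context_chars:
            break
        selected.append(chunk)
        used += len(chunk)

    context = "\n---\n".join(selected)
    return (
        f"Project context:\n{context}\n\n"
        f"Requirement:\n{requirement}\n\n"
        "Write test cases for the requirement using only the context above."
    )


# Hypothetical retrieved chunks; the third one does not fit the budget.
example_chunks = ["A" * 10, "B" * 10, "C" * 10]
prompt = build_prompt("Login must lock after 5 failures", example_chunks, max_context_chars=25)
```

A real system would usually budget in tokens rather than characters, but the idea is the same: the model only sees the context that retrieval selected.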

To achieve this, we started by developing multiple connectors of our own, despite the availability of LlamaIndex, so that we could tailor the solution to our needs. We experimented with various vector databases, initially using Vespa for its hybrid search capabilities, which combine keyword (lexical) search with vector (semantic) search. We then switched to Azure AI Search, which promised a more manageable setup without the need to invest time in cluster management. This switch allowed us to focus more on our core development tasks.
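To make the idea of hybrid search concrete, here is a minimal sketch that merges a keyword ranking and a vector ranking with reciprocal rank fusion (RRF). RRF is one common fusion technique, used here purely for illustration; this does not describe how Vespa or Azure AI Search were configured in our setup. The document names and the constant `k=60` are assumptions.

```python
from collections import defaultdict
from typing import Dict, List


def reciprocal_rank_fusion(rankings: List[List[str]], k: int = 60) -> List[str]:
    """Merge several ranked lists of document IDs into a single ranking.

    Each document receives 1 / (k + rank) from every list it appears in.
    """
    scores: Dict[str, float] = defaultdict(float)
    for ranking in rankings:
        for rank, doc_id in enumerate(ranking, start=1):
            scores[doc_id] += 1.0 / (k + rank)
    # Highest fused score first; ties are broken by document ID for determinism.
    return sorted(scores, key=lambda doc_id: (-scores[doc_id], doc_id))


# Hypothetical results for the same query from two retrievers.
keyword_hits = ["api_spec.md", "login_tests.py", "changelog.md"]
vector_hits = ["login_tests.py", "auth_design.md", "api_spec.md"]

fused = reciprocal_rank_fusion([keyword_hits, vector_hits])
```

Documents that rank well in both lists rise to the top, which is the practical benefit of hybrid retrieval over either method alone.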

Experiments with LLMs and Agent Frameworks

For the RAG system, we started with GPT-3.5 and also explored open-source models such as Llama 2. Llama 2 provided decent outputs, but when it came to building an agent framework, we faced some challenges. We experimented with frameworks like AutoGen and CrewAI, but we needed more control over the process. This led us to try LangGraph, which proved to be the right fit for our needs.
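The kind of control a graph-based approach gives can be pictured as explicit nodes with conditional edges between them. The sketch below is plain Python and deliberately does not use LangGraph's API; the node names, the stubbed "LLM" step, and the approval rule are all hypothetical.

```python
from typing import Any, Callable, Dict

State = Dict[str, Any]


def retrieve_context(state: State) -> State:
    # Stub: a real node would query the retrieval system.
    return {**state, "context": f"docs about {state['requirement']}"}


def generate_tests(state: State) -> State:
    # Stub for an LLM call: each attempt produces one more test case.
    attempts = state.get("attempts", 0) + 1
    tests = [f"test_case_{i}" for i in range(1, attempts + 1)]
    return {**state, "attempts": attempts, "tests": tests}


def review(state: State) -> State:
    # Hypothetical acceptance rule: at least two test cases are required.
    return {**state, "approved": len(state["tests"]) >= 2}


NODES: Dict[str, Callable[[State], State]] = {
    "retrieve": retrieve_context,
    "generate": generate_tests,
    "review": review,
}


def next_node(current: str, state: State) -> str:
    """Conditional edges: review loops back to generate until approved."""
    if current == "retrieve":
        return "generate"
    if current == "generate":
        return "review"
    return "END" if state["approved"] else "generate"


def run(state: State, start: str = "retrieve", max_steps: int = 10) -> State:
    node, steps = start, 0
    while node != "END":
        if steps >= max_steps:
            raise RuntimeError("step budget exceeded")
        state = NODES[node](state)
        node = next_node(node, state)
        steps += 1
    return state


result = run({"requirement": "login API"})
```

Because every transition is written out, it is easy to add a retry limit, a human review step, or logging at any edge, which is harder when the framework decides the conversation flow.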

Challenges with LangChain and Adaptation

While LangChain was initially promising, we encountered compatibility issues when working with multiple models. LangChain’s early libraries were mainly compatible with OpenAI APIs, so switching to other models such as Claude or Gemini required significant code changes, and AgentExecutor was not reliable at tool calling. This challenge made us reconsider whether to continue with LangChain or develop a completely new agent framework. Fortunately, the release of LangChain 0.2, with its enhanced abstraction and control, allowed us to restructure our workflows efficiently and resolve the compatibility issues.
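One reason model switching is painful is that tool calls can come back in different shapes. A small normalization layer keeps the rest of the code independent of the model behind it. The sketch below assumes two simplified, hypothetical response shapes; they are not exact vendor payloads, and `ToolCall`, `run_tool`, and the `count_tests` tool are invented for the example.

```python
import json
from dataclasses import dataclass
from typing import Any, Callable, Dict, List


@dataclass(frozen=True)
class ToolCall:
    name: str
    arguments: Dict[str, Any]


def from_json_string_style(message: Dict[str, Any]) -> List[ToolCall]:
    # Shape A (hypothetical): arguments arrive as a JSON-encoded string.
    return [
        ToolCall(call["function"]["name"], json.loads(call["function"]["arguments"]))
        for call in message.get("tool_calls", [])
    ]


def from_content_block_style(message: Dict[str, Any]) -> List[ToolCall]:
    # Shape B (hypothetical): arguments arrive as a dict inside typed content blocks.
    return [
        ToolCall(block["name"], block["input"])
        for block in message.get("content", [])
        if block.get("type") == "tool_use"
    ]


def run_tool(call: ToolCall, registry: Dict[str, Callable[..., Any]]) -> Any:
    """Dispatch a normalized tool call to a registered Python function."""
    return registry[call.name](**call.arguments)


# Hypothetical tool and two differently shaped responses requesting it.
registry = {"count_tests": lambda suite: {"suite": suite, "count": 3}}

message_a = {"tool_calls": [{"function": {"name": "count_tests", "arguments": '{"suite": "login"}'}}]}
message_b = {"content": [{"type": "text", "text": "Checking."},
                         {"type": "tool_use", "name": "count_tests", "input": {"suite": "login"}}]}

calls_a = from_json_string_style(message_a)
calls_b = from_content_block_style(message_b)
```

Both responses normalize to the same `ToolCall`, so the agent loop only has to handle one shape regardless of the model in use.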

Cost Management and Model Selection

Cost management became a critical concern as we progressed. We primarily used GPT-4 for tasks requiring high reasoning capabilities and GPT-3.5 for the rest. For specific tasks, such as script generation, we considered fine-tuning models. Initially, we thought of using open-source models to manage costs, but due to time constraints and the need for reliable performance, we decided to fine-tune GPT-3.5 instead. This decision proved beneficial, providing the results we needed without significant overhead.

The recent release of GPT-4o gave us a little breathing room on cost.

Open-source models still need more time to catch up with GPT-4/4o-level reasoning while offering a sufficient context window. We have not stopped hunting for alternatives to GPT-4/4o.
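The split described in this section, a stronger model for reasoning-heavy work, a fine-tuned model for script generation, and a cheaper model for everything else, can be sketched as a simple routing table with a cost estimate. The model names, task types, and per-token prices below are placeholders for illustration, not real pricing and not our production configuration.

```python
from dataclasses import dataclass
from typing import Dict


@dataclass(frozen=True)
class ModelProfile:
    name: str
    usd_per_1k_input: float
    usd_per_1k_output: float


# Placeholder prices for illustration only; check current provider pricing.
STRONG = ModelProfile("strong-reasoning-model", 0.03, 0.06)
CHEAP = ModelProfile("cheap-general-model", 0.001, 0.002)
TUNED = ModelProfile("fine-tuned-script-model", 0.003, 0.006)

# Hypothetical task types mapped to the model that handles them.
ROUTES: Dict[str, ModelProfile] = {
    "test_planning": STRONG,
    "script_generation": TUNED,
}


def pick_model(task_type: str) -> ModelProfile:
    """Route known task types explicitly; everything else goes to the cheap model."""
    return ROUTES.get(task_type, CHEAP)


def estimate_cost(model: ModelProfile, input_tokens: int, output_tokens: int) -> float:
    """Estimated cost in USD for one call."""
    return (
        input_tokens / 1000 * model.usd_per_1k_input
        + output_tokens / 1000 * model.usd_per_1k_output
    )


planning_cost = estimate_cost(pick_model("test_planning"), input_tokens=2000, output_tokens=500)
summary_cost = estimate_cost(pick_model("summarize_logs"), input_tokens=2000, output_tokens=500)
```

With these placeholder numbers, the same call is roughly 30 times cheaper on the general model than on the strong one, which is why routing by task type matters more than any single prompt optimization.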

Current Opportunities and Future Plans

Our current opportunities include improving the depth and breadth of our test coverage, reducing operational costs, and minimizing processing time. We are committed to refining our RAG system so that it keeps meeting our evolving requirements, and to exploring open-source models for long-term sustainability. However, at present, GPT-4o meets our needs effectively. We plan to complete our use cases with GPT-4o and then explore optimization opportunities to better manage costs.

Stay tuned as we continue to push the boundaries of what’s possible with PostQode and AI. Our journey is far from over, and we are excited to see where it leads us next.

This concludes our two-part series on PostQode’s AI Odyssey. We hope you enjoyed learning about our transformation and the exciting technical choices that have shaped our platform.

P.S.: This is not a technical recommendation or suggestion of tech stacks; we just want to share our journey and the options we chose.