text-embedding-3-large model and stored in Nile. The code snippets are stored in a separate table in Nile.
The response to each question is generated by querying the code snippets table using the pgvector extension and then sending the relevant documents to OpenAI’s gpt-4o-mini model. The response is streamed back to the user in real time.
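The retrieval step described above can be sketched as two small helpers: one that builds a nearest-neighbor query for pgvector, and one that turns the retrieved snippets into chat messages. This is a minimal sketch, not the example app's actual code: the table name `file_embeddings`, its columns, and the `Snippet` shape are assumptions made for illustration.

```typescript
// Hypothetical shape of a retrieved row; the real schema may differ.
interface Snippet {
  fileName: string;
  content: string;
}

type ChatMessage = { role: "system" | "user"; content: string };

interface ParameterizedQuery {
  text: string;
  values: (string | number)[];
}

// Builds a parameterized nearest-neighbor query. `<=>` is pgvector's
// cosine-distance operator. Table and column names are assumptions.
function buildSimilarityQuery(
  vectorLiteral: string, // e.g. "[0.1,0.2,...]", the embedded question
  projectId: string,
  limit = 5
): ParameterizedQuery {
  return {
    text:
      "SELECT file_name, content FROM file_embeddings " +
      "WHERE project_id = $2 " +
      "ORDER BY embedding <=> $1::vector LIMIT $3",
    values: [vectorLiteral, projectId, limit],
  };
}

// Packs the retrieved snippets into a system message and puts the
// user's question in a separate user message.
function buildMessages(question: string, snippets: Snippet[]): ChatMessage[] {
  const context = snippets
    .map((s) => `File: ${s.fileName}\n${s.content}`)
    .join("\n---\n");
  return [
    { role: "system", content: `Answer using only the code below.\n\n${context}` },
    { role: "user", content: question },
  ];
}
```

The resulting messages could then be passed to a chat completion request for gpt-4o-mini with streaming enabled, so that tokens reach the user as they are generated.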
Because of Nile’s virtual tenant databases, the retrieved code snippets will only come from the tenant the user selected, and Nile validates that the user has permission to view this tenant’s data. There is no risk of accidentally retrieving code that belongs to the wrong tenant.
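As a rough illustration of the idea, the sketch below builds the statement that scopes a Postgres session to one tenant. It assumes the `nile.tenant_id` session setting from Nile's documentation. The helper name `tenantScopeStatement` is hypothetical, and the example app may set the tenant context through Nile's SDK rather than with raw SQL.

```typescript
// Tenant ids are assumed to be UUIDs.
const UUID_PATTERN =
  /^[0-9a-f]{8}-[0-9a-f]{4}-[0-9a-f]{4}-[0-9a-f]{4}-[0-9a-f]{12}$/i;

// SET statements cannot take bind parameters, so the id is validated
// before it is inlined into the SQL text.
function tenantScopeStatement(tenantId: string): string {
  if (!UUID_PATTERN.test(tenantId)) {
    throw new Error("tenantId must be a UUID");
  }
  return `SET nile.tenant_id = '${tenantId}'`;
}
```

Once the session is scoped this way, queries against tenant-aware tables only see rows that belong to that tenant.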
The embedding column is of type vector(1024). The vector type is provided by the pgvector extension for storing embeddings.
The size of the vector has to match the number of dimensions in the embeddings produced by the model you use.
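A small guard can catch a dimension mismatch before the insert reaches the database. The sketch below is illustrative only, and `toVectorLiteral` is a hypothetical helper name. It assumes the embedding is written as a pgvector text literal. OpenAI's text-embedding-3 models accept a `dimensions` request parameter, so a request targeting a vector(1024) column would presumably set it to 1024.

```typescript
// Must equal N in the column type vector(N); 1024 matches vector(1024).
const EMBEDDING_DIMENSIONS = 1024;

// Converts an embedding into pgvector's text form, e.g. "[0.5,-1,2]",
// after checking that its length matches the column definition.
function toVectorLiteral(
  embedding: number[],
  expected = EMBEDDING_DIMENSIONS
): string {
  if (embedding.length !== expected) {
    throw new Error(`Expected ${expected} dimensions, got ${embedding.length}`);
  }
  if (!embedding.every((value) => Number.isFinite(value))) {
    throw new Error("Embedding contains a non-finite value");
  }
  return `[${embedding.join(",")}]`;
}
```

Failing fast here gives a clearer error than the one Postgres raises when a vector of the wrong size is inserted.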
The table has a tenant_id column, which makes it tenant-aware. By storing embeddings in a tenant-aware table, we can use Nile’s built-in tenant isolation to ensure that information about one tenant’s code won’t leak to other tenants.
We also need somewhere to store the code itself. It doesn’t have to be Postgres; S3 or GitHub would also work. But Postgres is convenient in our example.
create-next-app for this:
.env.example to .env.local, and update it with your Nile credentials, OpenAI credentials, and (if using Google SSO) Nile’s API URL.
npm install.
In src/lib/OrgRepoEmbedder.ts, update the await embedDirectory(...) calls to refer to your repos. It is typical to map each GitHub organization to a Nile tenant and each repo to a project, but you can model this in any way that makes sense to you. Keep in mind that CodeAssist will only use embeddings from the current project and current tenant as context, so make sure each project is interesting enough to discuss.
node --experimental-specifier-resolution=node --loader ts-node/esm --no-warnings src/lib/OrgRepoEmbedder.ts
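One way to keep the organization-to-tenant and repo-to-project mapping readable is to describe it as plain data and flatten it into a list of targets. This is a sketch only. The `EmbedTarget` shape and the `planEmbeddings` helper are hypothetical and not part of the example app, and the arguments that embedDirectory(...) actually takes are not shown here.

```typescript
// One entry per repo to embed; all field names are illustrative.
interface EmbedTarget {
  tenantName: string;  // GitHub organization, mapped to a Nile tenant
  projectName: string; // repository, mapped to a project
  directory: string;   // local checkout to read files from
}

// organization name -> repo name -> local checkout path
type OrgRepoMap = Record<string, Record<string, string>>;

// Flattens the nested mapping into one target per repo.
function planEmbeddings(orgs: OrgRepoMap): EmbedTarget[] {
  const targets: EmbedTarget[] = [];
  for (const [org, repos] of Object.entries(orgs)) {
    for (const [repo, directory] of Object.entries(repos)) {
      targets.push({ tenantName: org, projectName: repo, directory });
    }
  }
  return targets;
}
```

Each target could then drive one embedDirectory(...) call, which keeps the list of repos in one place when you add or remove projects.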
select * from users in the query editor.
Once you choose a tenant, you can select a project, browse files, and, most importantly, ask CodeAssist any question about the projects you embedded.