Oracle’s Vector Datatype

At Oracle CloudWorld 2023, Oracle announced that it was building Artificial Intelligence (AI) into many of its products, a big step toward making AI part of everyday work. By the end of 2023, many other industry leaders had announced similar plans.

When it comes to databases, Oracle stands out among industry leaders for how much it asks of its core product. For at least a decade, Oracle has turned the Oracle Database into a Swiss army knife that supports modern data types, analytics, and development paradigms, all in one product. With the AI revolution under way, it is only natural that Oracle would build a data type that lets organizations use Retrieval-Augmented Generation (RAG) within the database.

By adding a “vector” datatype, Oracle simplifies data architectures and the building of RAG or Private-LLM configurations for organizations.

Where is the Vector datatype?

If you use an Oracle Database today, you will not immediately have access to the Vector datatype. Even if you use the latest version, 23.3.x.x, on Oracle Cloud Infrastructure (OCI), you cannot access this datatype (believe me, I tried). To use it now, you have to be part of the beta program for the next release of Oracle Database, which provides details on the Vector datatype ahead of its initial release in 23.4.

In short, if you are not part of the beta program, you will have to wait for the moment, but this datatype should be available soon!

What is the Vector datatype?

The Vector datatype is a modern datatype designed to efficiently store, manage, and index massive amounts of high-dimensional data. Interest in this datatype is growing, and it is used to create additional value for generative AI use cases and applications.

Vector Settings?

The vector datatype is used within standard Oracle tables. This enables database schemas to use the data in real-time. The following command shows a simple example:

sql> CREATE TABLE rd_vectors (id NUMBER, embed VECTOR);

This simple example shows that the vector datatype can be set as a column within a table. Enabling it this way allows you to specify vectors of different dimensions with different formats. Think of this as a catch-all setting for vector data.
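As a minimal sketch of the catch-all behavior, the statements below store two vectors of different lengths in the same column and then ask the database what it stored. The row values are made up for illustration. The bracketed string form of a vector and the functions VECTOR_DIMENSION_COUNT and VECTOR_DIMENSION_FORMAT are taken from Oracle's published AI Vector Search documentation, so treat them as assumptions that may behave differently in the beta.

```sql
-- Assumes the rd_vectors table created above (id NUMBER, embed VECTOR).
-- Row values are invented for illustration.
INSERT INTO rd_vectors VALUES (1, '[1.1, 2.2, 3.3]');
INSERT INTO rd_vectors VALUES (2, '[10, 20]');

-- Report the number of dimensions and the storage format of each vector.
SELECT id,
       VECTOR_DIMENSION_COUNT(embed)  AS dims,
       VECTOR_DIMENSION_FORMAT(embed) AS fmt
FROM   rd_vectors
ORDER  BY id;
```

Because the column was declared without a dimension count or format, both inserts should be accepted even though the vectors differ in length.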

It is great to have a catch-all; however, you can limit the type of vectors created by imposing constraints on the stored data. In this example, every stored vector must have exactly 1024 dimensions, and they must be formatted as INT8 (8-bit integers):

sql> CREATE TABLE rd_vectors_int8 (id NUMBER, embed VECTOR(1024, INT8));

With this constrained example, each vector must have exactly 1024 dimensions, and each dimension must be an 8-bit integer (INT8). The number of dimensions must be greater than 0, with no limit. The dimension formats are INT8, FLOAT32, and FLOAT64. FLOAT32 and FLOAT64 are the IEEE standard floating-point formats, and Oracle Database will automatically cast the values as needed.

Examples of setting additional dimension formats are:

sql> CREATE TABLE rd_vectors_float32 (id NUMBER, embed VECTOR(1024, FLOAT32));

sql> CREATE TABLE rd_vectors_float64 (id NUMBER, embed VECTOR(1024, FLOAT64));
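To see what a fixed dimension count means in practice, here is a small sketch that uses a hypothetical table, rd_vectors_small, with only 3 dimensions so the values fit on one line. The table name and the row values are assumptions for illustration, and the bracketed string form of a vector comes from Oracle's published documentation rather than from this post.

```sql
-- Hypothetical table: 3 dimensions keeps the example values short.
CREATE TABLE rd_vectors_small (id NUMBER, embed VECTOR(3, INT8));

-- Matches the declared shape: 3 values, each within the 8-bit integer range.
INSERT INTO rd_vectors_small VALUES (1, '[1, 2, 3]');

-- Only 2 values: expected to be rejected because the column requires 3.
INSERT INTO rd_vectors_small VALUES (2, '[1, 2]');
```

The first insert should succeed, while the second should fail with a dimension mismatch, which is the constraint doing its job.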

Vector Forms?

With the Vector settings understood, there are a few forms that a vector column can take. Understanding these forms will help you define the proper vector for your requirements:

[Image: vector_settings]

Important Note: A vector can be NULL, but its individual dimensions cannot be NULL. For example, you cannot have [1.1, NULL, 2.3].
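The difference between a NULL vector and a NULL dimension can be sketched with two inserts. This assumes the rd_vectors table from the Vector Settings section; the id values are arbitrary, and the bracketed string form of a vector is assumed from Oracle's published documentation.

```sql
-- Assumes the rd_vectors table from earlier (id NUMBER, embed VECTOR).
-- A NULL vector: the whole column value is missing, which is allowed.
INSERT INTO rd_vectors VALUES (3, NULL);

-- A NULL dimension inside a vector: expected to raise an error.
INSERT INTO rd_vectors VALUES (4, '[1.1, NULL, 2.3]');
```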

Examples of Vectors:

Now that you understand the Vector datatype, how does the Oracle Database see it? The following SQL example creates the table rd_vector, made up entirely of vector columns in different variations, and then describes it.

sql> CREATE TABLE vector.rd_vector (
v1 VECTOR,
v2 VECTOR(3, FLOAT32),
v3 VECTOR(2, FLOAT64),
v4 VECTOR(1, INT8),
v5 VECTOR(1, *),
v6 VECTOR(*, FLOAT32),
v7 VECTOR(*, *)
);

sql> desc vector.rd_vector;

Name                                      Null?    Type
----------------------------------------- -------- ------------------
V1                                                 VECTOR(*, *)
V2                                                 VECTOR(3, FLOAT32)
V3                                                 VECTOR(2, FLOAT64)
V4                                                 VECTOR(1, INT8)
V5                                                 VECTOR(1, *)
V6                                                 VECTOR(*, FLOAT32)
V7                                                 VECTOR(*, *)
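Once vectors are stored, the typical RAG-style use is to rank rows by how close they are to a query vector. The following is a rough sketch, not a tuned search: the table rd_docs and its three rows are hypothetical, and VECTOR_DISTANCE, TO_VECTOR, and the COSINE metric are taken from Oracle's published AI Vector Search documentation, so they may differ in the beta.

```sql
-- Hypothetical table and data, for illustration only.
CREATE TABLE rd_docs (id NUMBER, embed VECTOR(3, FLOAT32));
INSERT INTO rd_docs VALUES (1, '[1.0, 0.0, 0.0]');
INSERT INTO rd_docs VALUES (2, '[0.0, 1.0, 0.0]');
INSERT INTO rd_docs VALUES (3, '[0.9, 0.1, 0.0]');

-- Rank rows by cosine distance to a query vector; smallest distance first.
SELECT id,
       VECTOR_DISTANCE(embed, TO_VECTOR('[1.0, 0.0, 0.0]'), COSINE) AS dist
FROM   rd_docs
ORDER  BY dist
FETCH  FIRST 2 ROWS ONLY;
```

With this data, rows 1 and 3 point in nearly the same direction as the query vector, so they should come back ahead of row 2. In a real application the vectors would come from an embedding model rather than being typed by hand.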

Hopefully, Oracle will release Oracle Database 23.4 soon! There will be many opportunities to use vector data types as organizations expand their usage of Generative AI.

Enjoy!
