At Oracle CloudWorld 2023, Oracle announced that it was moving toward enabling Artificial Intelligence (AI) within many of its products. This is a big step toward making AI something many people use daily. As 2023 ended, many other industry leaders announced they would do the same.
Regarding databases, Oracle is the only industry leader that leverages its core product for many different things. For at least a decade, Oracle has turned the Oracle Database into a Swiss army knife by enabling it to support different modern data types, analytics, and development paradigms, all in one product. It is only natural that, with the AI revolution starting, Oracle would build a data type that enables organizations to use Retrieval-Augmented Generation (RAG) within the database.
By adding a “vector” datatype, Oracle simplifies data architectures and the building of RAG or Private-LLM configurations for organizations.
Where is the Vector datatype?
If you use an Oracle Database today, you will not immediately have access to the Vector datatype. Even if you use the latest version, 23.3.x.x, on Oracle Cloud Infrastructure (OCI), you cannot access this datatype (believe me, I tried). You have to be part of the beta program for the next release of Oracle Database, which gives you details on the Vector datatype before its initial release in 23.4.
In short, if you are not part of the beta program, you will have to wait for now, but this datatype will be available soon!
What is the Vector datatype?
The Vector datatype is a modern datatype designed to efficiently store, manage, and index massive amounts of high-dimensional data. This data type is growing in interest and is used to create additional value for generative AI use cases and applications.
Vector Settings?
The vector datatype is used within standard Oracle tables. This enables database schemas to use the data in real-time. The following command shows a simple example:
sql> CREATE TABLE rd_vectors (id NUMBER, embed VECTOR);
This simple example shows that the vector datatype can be set as a column within a table. Enabling it this way allows you to specify vectors of different dimensions with different formats. Think of this as a catch-all setting for vector data.
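As a rough sketch of what the catch-all column allows, the statements below load two vectors of different lengths into the rd_vectors table created above. The bracketed string-literal syntax is an assumption based on Oracle's later published documentation for the VECTOR type and was not verified against the beta; the ids and values are made up for illustration.

```sql
-- Assumes the rd_vectors table (id NUMBER, embed VECTOR) already exists.
-- Assumption: a vector can be supplied as a bracketed text literal.

-- A 3-dimension vector
INSERT INTO rd_vectors (id, embed) VALUES (1, '[1.1, 2.2, 3.3]');

-- A 5-dimension vector in the same column, since no dimension count was declared
INSERT INTO rd_vectors (id, embed) VALUES (2, '[10, 20, 30, 40, 50]');

SELECT id, embed FROM rd_vectors ORDER BY id;
```

Because the column was declared without a dimension count or format, both rows should be accepted side by side.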
It is great to have a catch-all; however, you can limit the type of vectors created by imposing constraints on the stored data. In this example, every stored vector must have exactly 1024 dimensions, and each dimension must be formatted as INT8 (an 8-bit integer):
sql> CREATE TABLE rd_vectors_int8 (id NUMBER, embed VECTOR(1024, INT8));
With this constrained example, each vector must have 1024 dimensions, each of which must be an 8-bit integer (INT8). The number of dimensions must be greater than 0, with no limit. The dimension formats are INT8, FLOAT32, and FLOAT64. FLOAT32 and FLOAT64 are the IEEE floating-point standards, and Oracle Database will automatically cast the values as needed.
Examples of setting additional dimension formats are:
sql> CREATE TABLE rd_vectors_float32 (id NUMBER, embed VECTOR(1024, FLOAT32));
sql> CREATE TABLE rd_vectors_float64 (id NUMBER, embed VECTOR(1024, FLOAT64));
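To see what a dimension constraint means in practice, here is a minimal sketch using a deliberately small, hypothetical table (rd_vectors_demo, with 3 dimensions instead of 1024 so the values fit on one line). The literal syntax is again an assumption taken from Oracle's later documentation, not something confirmed in the beta.

```sql
-- Hypothetical table for illustration: exactly 3 dimensions, each stored as INT8.
CREATE TABLE rd_vectors_demo (id NUMBER, embed VECTOR(3, INT8));

-- Matches the declared shape (3 dimensions), so this should be accepted.
INSERT INTO rd_vectors_demo (id, embed) VALUES (1, '[1, 2, 3]');

-- Supplies 4 dimensions for a 3-dimension column, so this is expected to be rejected.
INSERT INTO rd_vectors_demo (id, embed) VALUES (2, '[1, 2, 3, 4]');
```

The same idea applies to the 1024-dimension tables: the declared count is a requirement on every row, not an upper bound.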
Vector Forms?
With the understanding of Vector settings, there are a few forms that a vector can take. Understanding these forms will help in defining the proper vector for your requirements:
Important Note: A vector can be NULL, but its individual dimensions cannot be NULL. (Example: you cannot have [1.1, NULL, 2.3].)
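The note above can be sketched against the rd_vectors table from earlier. This assumes the bracketed text-literal syntax from Oracle's later documentation; the ids are made up, and the exact error raised by the second statement is not shown because it was not verified.

```sql
-- A NULL vector is allowed: the whole column value is simply missing.
INSERT INTO rd_vectors (id, embed) VALUES (10, NULL);

-- A NULL dimension inside a vector is not allowed, so this is expected to fail.
INSERT INTO rd_vectors (id, embed) VALUES (11, '[1.1, NULL, 2.3]');
```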
Examples of Vectors:
Now that you understand the Vector datatype, how does the Oracle Database see the datatype? The following SQL example shows that the table rd_vector is created with only vector datatypes using different variations.
sql> CREATE TABLE vector.rd_vector (
v1 VECTOR,
v2 VECTOR(3, FLOAT32),
v3 VECTOR(2, FLOAT64),
v4 VECTOR(1, INT8),
v5 VECTOR(1, *),
v6 VECTOR(*, FLOAT32),
v7 VECTOR(*, *)
);
sql> desc vector.rd_vector;
Name                                      Null?    Type
----------------------------------------- -------- ------------------
V1                                                 VECTOR(*, *)
V2                                                 VECTOR(3, FLOAT32)
V3                                                 VECTOR(2, FLOAT64)
V4                                                 VECTOR(1, INT8)
V5                                                 VECTOR(1, *)
V6                                                 VECTOR(*, FLOAT32)
V7                                                 VECTOR(*, *)
Hopefully, Oracle will release Oracle Database 23.4 soon! There will be many opportunities to use vector data types as organizations expand their usage of Generative AI.
Enjoy!
Bobby Curtis
I’m Bobby Curtis and I’m just your normal average guy who has been working in the technology field for a while (I started when I was 18 with the US Army). The goal of this blog has changed a bit over the years. Initially, it was a general blog where I wrote thoughts down. Then it changed to focus on the Oracle Database, Oracle Enterprise Manager, and eventually Oracle GoldenGate.
If you want to follow me in a more timely manner, you can find me on Twitter at @dbasolved or on LinkedIn under “Bobby Curtis MBA”.