Using Snowflake Data with Pandas

By Eloi Sanchez on 20 Mar, 2023


Today we are going to see how to use our Snowflake data with pandas in Python.

The data that we are going to load into pandas is the following table, which we created in the previous blog post and which is stored in our Snowflake account.

ID	NAME	EMAIL	NUM
1	Seana Maunders	smaunders0@samsung.com	555-983-6281
2	Leanna Entres	lentres1@miibeian.gov.cn	956-437-2011
3	Cherin Geydon	cgeydon2@stumbleupon.com	692-616-2135
4	Klarrisa Thurlbeck	kthurlbeck3@ocn.ne.jp	767-554-5346
5	Dwayne Hurling	dhurling4@cbc.ca	418-885-5011
6	Rochette Ballham	rballham5@rambler.ru	325-152-2767
7	Brenna Fruish	bfruish6@google.com.hk	766-239-1379
8	Al Deare	adeare7@networksolutions.com	845-560-8781
9	Felicdad McClarence	fmcclarence8@salon.com	129-698-9707
10	Kiley Readhead	kreadhead9@instagram.com	107-619-2098

Mock data that we are going to use

First, we have to install the version of the Snowflake connector that can work with pandas:

pip install "snowflake-connector-python[pandas]"

Then, in Python, we can create the Snowflake connection (see the previous blog post) to query the database.

import snowflake.connector

con = snowflake.connector.connect(
    user='python_user',
    password='python_password',
    account='<account_identifier>',
    warehouse='compute_wh',
    database='python_database',
    schema='public'
)
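Hardcoding the password is fine for a demo, but in a shared script you may prefer to read the connection settings from the environment. The following is a minimal sketch of that idea; the helper name snowflake_params_from_env and the SNOWFLAKE_* variable names are hypothetical conventions chosen for this example, not something the connector defines.

```python
import os


def snowflake_params_from_env(prefix="SNOWFLAKE_"):
    """Build keyword arguments for snowflake.connector.connect().

    The variable names (SNOWFLAKE_USER, SNOWFLAKE_PASSWORD, ...) are a
    hypothetical convention used only in this sketch.
    """
    keys = ("user", "password", "account", "warehouse", "database", "schema")
    params = {}
    for key in keys:
        name = prefix + key.upper()
        value = os.environ.get(name)
        if value is None:
            # Fail early with the name of the missing variable.
            raise KeyError(f"Missing environment variable {name}")
        params[key] = value
    return params


# Usage (needs a real Snowflake account, so it is left commented out):
# import snowflake.connector
# con = snowflake.connector.connect(**snowflake_params_from_env())
```

The returned dictionary has the same keys as the connect() call above, so it can be unpacked straight into it.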

Now we must create a cursor object, which is used to execute queries and to hold their metadata and results.

cur = con.cursor()

cur.execute('SELECT * FROM cool_table;')
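To illustrate what "information" the cursor keeps after a query, here is a small sketch that reads the column names from the cursor's description attribute and closes the cursor afterwards. The helper name column_names is made up for this example; it relies only on the standard Python DB-API behaviour (cursor(), execute(), description, close()), which the Snowflake cursor follows.

```python
from contextlib import closing


def column_names(con, sql):
    """Run `sql` and return the column names reported by the cursor."""
    # closing() calls cur.close() on exit, even if execute() raises.
    with closing(con.cursor()) as cur:
        cur.execute(sql)
        # Each entry of cur.description starts with the column name.
        return [col[0] for col in cur.description]


# Usage with the connection created above:
# print(column_names(con, 'SELECT * FROM cool_table;'))
```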

Finally, we can fetch the query results from the cursor with either of two methods. The first is cur.fetch_pandas_all(), which returns all the results in a single dataframe:

df = cur.fetch_pandas_all()

The second is cur.fetch_pandas_batches(), which returns an iterator that yields the result rows as a sequence of smaller dataframes:

for df in cur.fetch_pandas_batches():
    do_dataframe_stuff(df)
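The batch iterator is useful when the full result would not fit comfortably in memory, because only one batch needs to be held at a time. As a rough sketch, the loop can be wrapped in a helper that applies a callback to each batch and keeps a running row count; process_in_batches and handle_batch are hypothetical names introduced here for illustration.

```python
def process_in_batches(cur, handle_batch):
    """Apply `handle_batch` to every dataframe yielded by the cursor.

    Returns the total number of rows seen across all batches.
    """
    total_rows = 0
    for batch_df in cur.fetch_pandas_batches():
        handle_batch(batch_df)
        # len() of a dataframe is its number of rows.
        total_rows += len(batch_df)
    return total_rows


# Usage, after cur.execute(...):
# n = process_in_batches(cur, lambda batch_df: print(batch_df.shape))
```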

Either way, the result is, as expected, a dataframe (or a sequence of dataframes) containing the data shown in the table above. With these cursor methods, the dataframe's column names are those determined by the SQL query. Overall, this is very useful for our data analysis Python scripts.

And for those who were expecting a long and intricate explanation of how to do it: Yes! Using your Snowflake data in pandas really is this simple!
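Because Snowflake stores unquoted identifiers in uppercase, the columns arrive as ID, NAME, EMAIL and NUM, as in the table above. If lowercase names suit your script better, they can be renamed in pandas. The sketch below uses a hand-built dataframe with the first two rows of the table as a stand-in for the result of cur.fetch_pandas_all(), so that it runs without a Snowflake account.

```python
import pandas as pd

# Stand-in for df = cur.fetch_pandas_all(): first two rows of the table above.
df = pd.DataFrame(
    {
        "ID": [1, 2],
        "NAME": ["Seana Maunders", "Leanna Entres"],
        "EMAIL": ["smaunders0@samsung.com", "lentres1@miibeian.gov.cn"],
        "NUM": ["555-983-6281", "956-437-2011"],
    }
)

# Apply str.lower to every column label.
df = df.rename(columns=str.lower)
print(list(df.columns))  # ['id', 'name', 'email', 'num']
```

Alternatively, the names can be controlled on the SQL side by selecting the columns with aliases instead of using SELECT *.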

Complete code

Below you can see the complete code used for this blog.

import snowflake.connector

# Create the Snowflake-Python connection
con = snowflake.connector.connect(
    user='python_user',
    password='python_password',
    account='<account_identifier>',  # Change this to your own account
    warehouse='compute_wh',
    database='python_database',
    schema='public'
)

# Create the cursor and execute the query
cur = con.cursor()
cur.execute('SELECT * FROM cool_table;')

# Gather all the results in a single dataframe
df = cur.fetch_pandas_all()
print(df)

# Gather the results loaded into batches of smaller size
for df in cur.fetch_pandas_batches():
    print(df)
