Question Details

No question body available.

Tags

python pandas etl hana

Answers (2)

February 7, 2026 Score: 1 Rep: 349 Quality: Low Completeness: 80%

These two lines have no logical link in the code:

    cursor = connection.cursor()
    cursor.arraysize = 50000
    pd.read_sql_query(query, connection, chunksize=10000)

hence, read_sql_query doesn't use your cursor.

You should distinguish between chunksize and arraysize:

chunksize is a pandas parameter (how many rows each yielded DataFrame chunk contains), while arraysize belongs to the DB-API cursor (how many rows fetchmany() pulls per round trip).
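The difference is easy to see with any DB-API 2.0 driver; a minimal sketch, using sqlite3 as a stand-in for the HANA driver (table name and sizes are illustrative):

```python
import sqlite3
import pandas as pd

# Any DB-API 2.0 connection behaves the same way here.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE t (x INTEGER)")
conn.executemany("INSERT INTO t VALUES (?)", [(i,) for i in range(25)])
conn.commit()

# pandas chunksize: rows per yielded DataFrame chunk
n_chunks = sum(1 for _ in pd.read_sql_query("SELECT * FROM t", conn, chunksize=10))
print(n_chunks)  # 25 rows at 10 per chunk -> 3 chunks

# DB-API arraysize: rows returned per fetchmany() call
cur = conn.cursor()
cur.arraysize = 10
cur.execute("SELECT * FROM t")
batch = cur.fetchmany()
print(len(batch))  # 10
cur.close()
conn.close()
```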

Also:

df_chunk.to_dict("records")

here every row of the data frame is converted to a dict of the form

{ "col1": python_obj, ... }

which boxes each value into a Python object and consumes a significant amount of processor resources.
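For illustration, here is what the "records" orientation produces on a tiny hypothetical frame:

```python
import pandas as pd

# One Python dict per row; every value is boxed as a scalar object.
df_chunk = pd.DataFrame({"col1": [1, 2], "col2": ["a", "b"]})
records = df_chunk.to_dict("records")
# records is a list like [{"col1": 1, "col2": "a"}, {"col1": 2, "col2": "b"}]
print(records)
```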

My suggestions are:

  1. Run the query through the cursor: cursor.execute("SELECT * FROM tablename")

  2. Loop on get_rows = cursor.fetchmany() (it honors cursor.arraysize)

  3. Loop over every row in get_rows and process the row as you want.

  4. Close the cursor: cursor.close()
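The four steps above can be sketched with sqlite3 standing in for hdbcli (both are DB-API 2.0 drivers, so the cursor protocol is the same; table and sizes are illustrative):

```python
import sqlite3

connection = sqlite3.connect(":memory:")
connection.execute("CREATE TABLE tablename (id INTEGER, val TEXT)")
connection.executemany("INSERT INTO tablename VALUES (?, ?)",
                       [(i, f"row{i}") for i in range(10)])

cursor = connection.cursor()
cursor.arraysize = 4          # batch size for fetchmany (50000 in the answer)

cursor.execute("SELECT * FROM tablename")   # 1. run the query on the cursor
processed = 0
while True:
    get_rows = cursor.fetchmany()           # 2. fetch arraysize rows at a time
    if not get_rows:
        break
    for row in get_rows:                    # 3. process each row as you want
        processed += 1

cursor.close()                              # 4. release the cursor
connection.close()
print(processed)  # 10 rows processed
```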

February 7, 2026 Score: 1 Rep: 1 Quality: Low Completeness: 80%

You have conection instead of connection in pd.read_sql_query().

You are fetching chunks, converting each chunk to dictionaries with .to_dict("records"), and then extending a list. This creates a lot of overhead and intermediate objects, which defeats the purpose of chunking.

Also, while you set cursor.arraysize, pandas doesn't automatically use your cursor when you pass a connection object; it creates its own cursor internally.

Here's how I'd do it:

    from hdbcli import dbapi
    import pandas as pd

    connection = dbapi.connect(
        address="-----------",
        port="-----------",
        user="-----------",
        password="-----------",
    )

    cursor = connection.cursor()
    cursor.arraysize = 50000

    query = "SELECT * FROM tablename LIMIT 10000"

    # Option 1: simple read (pandas manages its own cursor)
    df = pd.read_sql_query(query, connection)

    # Option 2: chunked read, keeping DataFrames and concatenating once
    chunks = []
    for chunk in pd.read_sql_query(query, connection, chunksize=10000):
        chunks.append(chunk)
    df = pd.concat(chunks, ignore_index=True)

    # Option 3: manual cursor, so arraysize is actually used
    cursor.execute(query)
    columns = [desc[0] for desc in cursor.description]
    df = pd.DataFrame(cursor.fetchall(), columns=columns)

    cursor.close()
    connection.close()

  1. no unnecessary dict conversion

  2. correct arraysize usage

  3. less overhead
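The chunked approach can be exercised end to end without a HANA instance; a minimal sketch, again with sqlite3 standing in for hdbcli (table name and sizes are illustrative):

```python
import sqlite3
import pandas as pd

connection = sqlite3.connect(":memory:")
connection.execute("CREATE TABLE tablename (id INTEGER, val TEXT)")
connection.executemany("INSERT INTO tablename VALUES (?, ?)",
                       [(i, f"v{i}") for i in range(30)])

# Each chunk stays a DataFrame -- no per-row dict conversion --
# and a single concat builds the final frame.
chunks = []
for chunk in pd.read_sql_query("SELECT * FROM tablename", connection, chunksize=10):
    chunks.append(chunk)
df = pd.concat(chunks, ignore_index=True)

print(len(df))  # 30
connection.close()
```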