PyIceberg

PyIceberg provides a native Python API to work with Iceberg tables directly.

This guide shows you how to get started using PyIceberg with the BoringData catalog.

First, install the required libraries:

pip install "pyiceberg[s3fs]" duckdb requests

Load the catalog and list the available tables:

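from pyiceberg.table import StaticTable
import requests
import duckdb

# Run the BoringData bootstrap SQL so the boringdata catalog can be queried from DuckDB.
duckdb.sql(requests.get("https://catalog.boringdata.io/start.sql").text)
# List the tables registered in the catalog.
duckdb.sql("select table_name from boringdata.metadata.catalog;").show()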

Choose a table and load it:

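table_name = "taxi.yellow_tripdata"
# Look up the table's current metadata file in the catalog.
metadata_location = duckdb.sql(
    f"select metadata_file from boringdata.metadata.catalog where table_name = '{table_name}'"
).fetchall()[0][0]
# Load the table read-only from that metadata file.
table = StaticTable.from_metadata(
    metadata_location=metadata_location,
    properties=requests.get("https://catalog.boringdata.io/start_pyiceberg.json").json(),
)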
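
Once the table is loaded, you will usually want to read some rows from it. The following is a minimal sketch of one way to do that with PyIceberg's `scan()` and `to_arrow()`. The helper name `preview` is hypothetical and not part of PyIceberg or the BoringData catalog. It also assumes PyArrow is available, which the install command above may not pull in (if it is missing, `pip install pyarrow` should provide it).

```python
def preview(table, columns=("*",), limit=5):
    """Return up to `limit` rows of an Iceberg table as a PyArrow table.

    `table` is expected to be a PyIceberg table, for example the object
    returned by StaticTable.from_metadata(...). `preview` is a hypothetical
    helper name used only for illustration.
    """
    # selected_fields restricts the columns that are read; ("*",) reads all of them.
    # limit caps the number of rows returned by the scan.
    scan = table.scan(selected_fields=tuple(columns), limit=limit)
    # to_arrow() materializes the scan result in memory (requires PyArrow).
    return scan.to_arrow()
```

Pass the object returned by `StaticTable.from_metadata(...)` to `preview` to get a small PyArrow table. Keeping `limit` low avoids downloading more data files than needed for a first look.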