
14 posts tagged with "cayenne"

Cayenne (Vortex) data accelerator related topics and usage


Spice v2.0-rc.2 (Apr 10, 2026)

· 28 min read
Evgenii Khramkov
Senior Software Engineer at Spice AI

Announcing the release of Spice v2.0-rc.2! 🔥

v2.0.0-rc.2 is the second release candidate for advanced testing of v2.0, building on v2.0.0-rc.1.

Highlights in this release candidate include:

  • Distributed Spice Cayenne Query and Write Improvements with data-local query routing and partition-aware write-through
  • DataFusion v52.4.0 Upgrade with aligned arrow-rs, datafusion-federation, and datafusion-table-providers
  • MERGE INTO for Spice Cayenne catalog tables with distributed support across executors
  • PARTITION BY Support for Cayenne enabling SQL-defined partitioning in CREATE TABLE statements
  • ADBC Data Connector & Catalog with full query federation, BigQuery support, and schema/table discovery
  • Databricks Lakehouse Federation Improvements with improved reliability, resilience, DESCRIBE TABLE fallback, and source-native type parsing
  • Delta Lake Column Mapping supporting Name and Id mapping modes
  • HTTP Pagination support for paginated API endpoints in the HTTP data connector
  • New Catalog Connectors for PostgreSQL, MySQL, MSSQL, and Snowflake
  • JSON Ingestion Improvements with single-object support, SODA (Socrata Open Data) format support, json_pointer extraction, and auto-detection
  • Per-Model Rate-Limited AI UDF Execution for controlling concurrent AI function invocations
  • Dependency upgrades including Turso v0.5.3, iceberg-rust v0.9, and Vortex improvements

What's New in v2.0.0-rc.2

Distributed Cayenne Query and Write Improvements

Distributed query for Cayenne-backed tables now has better partition awareness for both reads and writes.

Key improvements:

  • Data-Local Query Routing: Cayenne catalog queries can now be routed to executors that hold the relevant partitions, improving distributed query efficiency.
  • Partition-Aware Write-Through: Scheduler-side Flight DoPut ingestion now splits partitioned Cayenne writes and forwards them to the responsible executors instead of routing through a single raw-forward path.
  • Dynamic Partition Assignment: Newly observed partitions can be added and assigned atomically as data arrives, with persisted partition metadata for future routing.
  • Better Cluster Coordination: Partition management is now separated for accelerated and federated tables, improving routing behavior for distributed Cayenne catalog workloads.
  • Distributed UPDATE/DELETE DML: UPDATE and DELETE statements for Cayenne catalog tables are now forwarded to all executors in distributed mode, with all executors required to succeed.
  • Distributed runtime.task_history: Task history is now replicated across the distributed cluster for observability.
  • RefreshDataset Control Stream: Dataset refresh operations are now distributed via the control stream to executors.
  • Executor DDL Sync: When an executor connects, it receives DDL for all existing tables, ensuring late-joining executors have full table state.

MERGE INTO for Spice Cayenne

Spice now supports MERGE INTO statements for Cayenne catalog tables, enabling upsert-style data operations with full distributed support.

Key improvements:

  • MERGE INTO Support: Execute MERGE INTO statements against Cayenne catalog tables for combined insert/update/delete operations.
  • Distributed MERGE: MERGE operations are automatically distributed across executors in cluster mode.
  • Data Safety: Duplicate source keys are detected and prevented to avoid data loss during MERGE operations.
  • Chunked Delete Filters: Large MERGE delete filter lists are chunked to prevent stack overflow with Vortex IN-list expressions.
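A minimal upsert sketch using standard SQL MERGE syntax; the table and column names (`events`, `events_staging`) are illustrative, and the exact set of clauses supported by Cayenne may differ:

```sql
MERGE INTO events AS t
USING events_staging AS s
  ON t.id = s.id
WHEN MATCHED THEN
  UPDATE SET region = s.region, ts = s.ts
WHEN NOT MATCHED THEN
  INSERT (id, region, ts) VALUES (s.id, s.region, s.ts)
```

Per the data-safety improvement above, a source with duplicate `id` values is rejected rather than silently overwriting rows.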

PARTITION BY Support for Cayenne

Spice now supports PARTITION BY in Cayenne-backed CREATE TABLE statements, enabling partition definitions to be expressed directly in SQL and persisted in the Cayenne catalog.

Key improvements:

  • SQL Partition Definition: Define Cayenne table partitioning directly in SQL using CREATE TABLE ... PARTITION BY (...).
  • Partition Validation: Partition expressions are parsed and validated during DDL analysis before table creation.
  • Persisted Partition Metadata: Partition metadata is stored in the Cayenne catalog and can be reloaded by the runtime after restart.
  • Distributed DDL Support: Partition metadata is forwarded when CREATE TABLE is distributed to executors in cluster mode.
  • Improved Type Support: Partition utilities now support newer string scalar variants such as Utf8View.

Example:

CREATE TABLE events (id INT, region TEXT, ts TIMESTAMP) PARTITION BY (region)

Catalog Connector Enhancements

Spice now includes additional catalog connectors for major database systems, improving schema discovery and federation workflows across external data systems.

Key improvements:

  • New Catalog Connectors: Added catalog connectors for PostgreSQL, MySQL, MSSQL, and Snowflake.
  • Schema and Table Discovery: Connectors use native metadata catalogs such as information_schema / INFORMATION_SCHEMA to discover schemas and tables.
  • Improved Federation Workflows: These connectors make it easier to expose external database metadata through Spice for cross-system federation scenarios.
  • PostgreSQL Partitioned Tables: Fixed schema discovery for PostgreSQL partitioned tables.
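Discovery queries of the kind these connectors issue resemble the following standard metadata query (illustrative; each connector uses its source's native catalog views):

```sql
SELECT table_schema, table_name
FROM information_schema.tables
WHERE table_type = 'BASE TABLE'
ORDER BY table_schema, table_name;
```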

Example PostgreSQL catalog configuration:

catalogs:
  - from: pg
    name: pg
    include:
      - 'public.*'
    params:
      pg_host: localhost
      pg_port: 5432
      pg_user: postgres
      pg_pass: ${secrets:POSTGRES_PASSWORD}
      pg_db: my_database
      pg_sslmode: disable

JSON Ingestion Improvements

JSON ingestion is now more flexible and robust.

Key improvements:

  • More JSON Formats: Added support for single-object JSON documents, auto-detected JSON formats, and Socrata SODA responses.
  • json_pointer Extraction: Extract nested payloads before schema inference and reading using RFC 6901 JSON Pointer syntax.
  • Better Auto-Detection: JSON format detection now handles arrays, objects, JSONL, and BOM-prefixed input more reliably, including single multi-line objects.
  • SODA Support: Added schema extraction and data conversion for Socrata Open Data API responses.
  • Broader Compatibility: Improved handling for BOM-prefixed files, CRLF-delimited JSONL, nested payloads, mixed structures, and wrapped documents.

Example using json_pointer to extract nested data from an API response:

datasets:
  - from: https://api.example.com/v1/data
    name: users
    params:
      json_pointer: /data/users
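The RFC 6901 pointer resolution used above can be modeled in a few lines of Python; this is an illustrative sketch of the pointer semantics, not Spice's implementation:

```python
# Illustrative model of RFC 6901 JSON Pointer semantics (not Spice's
# actual implementation): walk the parsed document token by token.
import json

def resolve_pointer(doc, pointer: str):
    """Resolve an RFC 6901 JSON Pointer against a parsed JSON document."""
    if pointer == "":
        return doc  # the empty pointer refers to the whole document
    for token in pointer.lstrip("/").split("/"):
        # Unescape per RFC 6901: ~1 -> '/' first, then ~0 -> '~'
        token = token.replace("~1", "/").replace("~0", "~")
        doc = doc[int(token)] if isinstance(doc, list) else doc[token]
    return doc

response = json.loads('{"data": {"users": [{"id": 1}, {"id": 2}]}}')
print(resolve_pointer(response, "/data/users"))  # [{'id': 1}, {'id': 2}]
```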

DataFusion v52.4.0 Upgrade

Apache DataFusion has been upgraded from v52.2.0 to v52.4.0, with aligned updates across arrow-rs, datafusion-federation, and datafusion-table-providers.

Key improvements:

  • DataFusion v52.4.0: Brings the latest fixes and compatibility improvements across query planning and execution.
  • Strict Overflow Handling: try_cast_to now uses strict cast to return errors on overflow instead of silently producing NULL values.
  • Federation Fix: Fixed SQL unparsing for Inexact filter pushdown with aliases.
  • Partial Aggregation Optimization: Improved partial aggregation performance for FlightSQLExec.

Dependency Upgrades

Dependency                   Version / Update
Turso (libsql)               v0.5.3 (from v0.4.4)
iceberg-rust                 v0.9
Vortex                       Map type support, stack-safe IN-lists
arrow-rs                     Arrow v57.2.0
datafusion-federation        Updated for DataFusion v52.4.0 alignment
datafusion-table-providers   Updated for DataFusion v52.4.0 alignment
datafusion-ballista          Bumped to fix BatchCoalescer schema mismatch panic

Other Improvements

  • Cayenne released as RC: The Cayenne data accelerator is now promoted to release candidate status.

  • File Update Acceleration Mode: Added mode: file_update acceleration mode for file-based data refresh.

  • spice completions Command: New CLI command for generating shell completion scripts, with auto-detection of shell directory.

  • --endpoint Flag: Added --endpoint flag to spice run with scheme-based routing for custom endpoints.

  • mTLS Client Auth: Added mTLS client authentication support to the spice sql REPL.

  • DynamoDB DML: Implemented DML (INSERT, UPDATE, DELETE) support for the DynamoDB table provider.

  • Caching Retention: Added retention policies for cached query results.

  • GraphQL Custom Auth Headers: Added custom authorization header support for the GraphQL connector.

  • ClickHouse Date32 Support: Added Date32 type support for the ClickHouse connector.

  • AWS IAM Role Source: Added iam_role_source parameter for fine-grained AWS credential configuration.

  • S3 Metadata Columns: Metadata columns renamed to _location, _last_modified, _size for consistency, with more robust handling in projected queries.

  • S3 URL Style: Added s3_url_style parameter for S3 connector URL addressing (path-style vs virtual-hosted). Useful for S3-compatible stores like MinIO:

    params:
      s3_endpoint: https://minio.local:9000
      s3_url_style: path
  • S3 Parquet Performance: Improved S3 parquet read performance.

  • HTTP Caching: Transient HTTP error responses such as 429 and 5xx are no longer cached, preventing stale error payloads from being served from cache.

  • HTTP Connector Metadata: Added response_headers as structured map data for HTTP datasets.

  • Views on_zero_results: Accelerated views now support on_zero_results: use_source to fall back to the source when no results are found:

    views:
      - name: sales_summary
        sql: |
          SELECT region, SUM(amount) as total
          FROM sales
          GROUP BY region
        acceleration:
          enabled: true
          on_zero_results: use_source
  • Flight DoPut Ingestion Metrics: Added rows_written and bytes_written metrics for Flight DoPut / ADBC ETL ingestion.

  • EXPLAIN ANALYZE Metrics: Added metrics for EXPLAIN ANALYZE in FlightSQLExec.

  • Scheduler Executor Metrics: Added scheduler_active_executors_count metric for monitoring active executors.

  • Query Memory Limit: Updated default query memory limit from 70% to 90%, with GreedyMemoryPool for improved memory management.

  • MetastoreTransaction Support: Added transaction support to prevent concurrent metastore transaction conflicts.

  • Iceberg REST Catalog: Coerce unsupported Arrow types to Iceberg v2 equivalents in the REST catalog API.

  • CDC Cache Invalidation: Improved cache invalidation for CDC-backed datasets.

  • Spice.ai Connector Alignment: Parameter names aligned across catalog and data connectors for Spice.ai Cloud.

  • Cayenne File Size: Cayenne now correctly respects the configured target file size (defaults to 128MB).

  • Cayenne Primary Keys: Properly set primary_keys/on_conflict for Cayenne tables.

  • Turso Metastore Performance: Cached metastore connections and prepared statements for improved Turso and SQLite metastore performance.

  • Turso SQL Robustness: More robust SQL unparsing and date comparison handling for Turso.

  • Dictionary Type Normalization: Normalize Arrow Dictionary types for DuckDB and SQLite acceleration.

  • GitHub Connector Resilience: Improved GraphQL client resilience, performance, and ref filter handling.

  • ODBC Fix: Fixed ODBC queries silently returning 0 rows on query failure.

  • Anthropic Fixes: Fixed compatibility issues with Anthropic model provider.

  • v1/responses API Fix: The /v1/responses API now correctly preserves client instructions when system_prompt is set.

  • Shared Acceleration Snapshots: Show an error when snapshots are enabled on a shared acceleration file.

  • Distributed Mode Error Handling: Improved error handling for distributed mode and state_location configuration.

  • Helm Chart: Added support for ServiceAccount annotations and AWS IRSA example.

  • Perplexity Removed: Removed Perplexity model provider support.

  • Rust v1.93.1: Upgraded Rust toolchain to v1.93.1.

Contributors

Breaking Changes

  • S3 metadata columns renamed: S3 metadata columns renamed from location, last_modified, size to _location, _last_modified, _size.
  • v1/evals API removed: The /v1/evals endpoint has been removed.
  • Perplexity removed: Perplexity model provider support has been removed.
  • Default query memory limit changed: Default query memory limit increased from 70% to 90%.
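Queries referencing the old S3 metadata column names must be updated to the new underscore-prefixed names; a sketch with a hypothetical dataset name:

```sql
-- Before (pre-rc.2):
SELECT location, last_modified, size FROM s3_files;
-- After (v2.0.0-rc.2):
SELECT _location, _last_modified, _size FROM s3_files;
```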

Upgrading

To upgrade to v2.0.0-rc.2, use one of the following methods:

CLI:

spice upgrade v2.0.0-rc.2

Homebrew:

brew upgrade spiceai/spiceai/spice

Docker:

Pull the spiceai/spiceai:2.0.0-rc.2 image:

docker pull spiceai/spiceai:2.0.0-rc.2

For available tags, see DockerHub.

Helm:

helm repo update
helm upgrade spiceai spiceai/spiceai --version 2.0.0-rc.2

AWS Marketplace:

Spice is available in the AWS Marketplace.

What's Changed

Changelog

  • ci: fix E2E CLI upgrade test to use latest release for spiced download by @phillipleblanc in #9613
  • fix(DF): Lazily initialize BatchCoalescer in RepartitionExec to avoid schema type mismatch by @sgrebnov in #9623
  • feat: Implement catalog connectors for various databases by @lukekim in #9509
  • Refactor and clean up code across multiple crates by @lukekim in #9620
  • fix: Improve error handling for distributed mode and state_location configuration by @lukekim in #9611
  • Properly install postgres in install-postgres action by @krinart in #9629
  • fix: Use Python venv for schema validation in CI by @phillipleblanc in #9637
  • Update spicepod.schema.json by @app/github-actions in #9640
  • Update testoperator dispatch to use release/2.0 branch by @phillipleblanc in #9641
  • fix: Align CUDA asset names in Dockerfile and install tests with build output by @phillipleblanc in #9639
  • Fix expect test scripts in E2E Installation AI test by @sgrebnov in #9643
  • testoperator for partitioned arrow accelerator by @Jeadie in #9635
  • Remove default 1s refresh_check_interval from spidapter for hive datasets by @phillipleblanc in #9645
  • Fix scheduler panic and cancel race condition by @phillipleblanc in #9644
  • Align Spice.ai connector parameter names across catalog/data connectors by @lukekim in #9632
  • docs: update distribution details and add NAS support in release notes by @lukekim in #9650
  • Enable postgres-accel in CI builds for benchmarks by @sgrebnov in #9649
  • perf: Cache Turso metastore connection across operations by @penberg in #9646
  • Add 'scheduler_state_location' to spidapter by @Jeadie in #9655
  • Implement Cayenne S3 Express multi-zone live test with data validation by @lukekim in #9631
  • chore(spidapter): bump default memory limit from 8Gi to 32Gi by @phillipleblanc in #9661
  • perf: Use prepare_cached() in Turso and SQLite metastore backends by @penberg in #9662
  • Improve CDC cache invalidation by @krinart in #9651
  • Refactor Cayenne IDs to use UUIDv7 strings by @lukekim in #9667
  • fix: add liveness check for dead executors in partition routing by @Jeadie in #9657
  • fix(s3): Fix metadata column schema mismatches in projected queries by @sgrebnov in #9664
  • s3_metadata_columns tests: include test for location outside table prefix by @sgrebnov in #9676
  • docs: Update DuckDB, GCS, Git connector and Cayenne documentation by @lukekim in #9671
  • Add s3_url_style support for S3 connector URL addressing by @phillipleblanc in #9642
  • Consolidate E2E workflows and require WSL for Windows runtime by @lukekim in #9660
  • Upgrade to Rust v1.93.1 by @lukekim in #9669
  • Security fixes and improvements by @lukekim in #9666
  • feat(flight): add DoPut rows/bytes written metrics for DoPut ETL ingestion tracking by @phillipleblanc in #9663
  • Skip caching http error response + add response_headers by @krinart in #9670
  • refactor: Remove v1/evals functionality by @Jeadie in #9420
  • Make a test harness for Distributed Spice integration tests by @Jeadie in #9615
  • Enable on_zero_results: use_source for views by @krinart in #9699
  • fix(spidapter): Lower memory limit, passthrough AWS secrets, override flight URL by @peasee in #9704
  • Show an error on a shared acceleration file with snapshots enabled by @krinart in #9698
  • Fixes for anthropic by @Jeadie in #9707
  • Use max_partitions_per_executor in allocate_initial_partitions by @Jeadie in #9659
  • [SpiceDQ] Accelerations must have partition key by @Jeadie in #9711
  • Upgrade to Turso v0.5 by @lukekim in #9628
  • feat: Rename metadata columns to _location, _last_modified, _size by @phillipleblanc in #9712
  • fix: bump datafusion-ballista to fix BatchCoalescer schema mismatch panic by @phillipleblanc in #9716
  • fix: Ensure Cayenne respects target file size by @peasee in #9730
  • refactor: Make DDL preprocessing generic from Iceberg DDL processing by @peasee in #9731
  • [SpiceDQ] Distribute query of Cayenne Catalog to executors with data by @Jeadie in #9727
  • Properly set primary_keys/on_conflict for Cayenne tables by @krinart in #9739
  • Add executor resource and replica support to cloud app config by @ewgenius in #9734
  • feat: Support PARTITION BY in Cayenne Catalog table creation by @peasee in #9741
  • Update datafusion and related packages to version 52.3.0 by @lukekim in #9708
  • Route FlightSQL statement updates through QueryBuilder by @phillipleblanc in #9754
  • JSON file format improvements by @lukekim in #9743
  • [SpiceDQ] Partition Cayenne catalogs writes through to executors by @Jeadie in #9737
  • Update to DF v52.3.0 versions of datafusion & datafusion-tableproviders by @lukekim in #9756
  • Make S3 metadata column handling more robust by @sgrebnov in #9762
  • Fetch API keys from dedicated endpoint instead of apps response by @phillipleblanc in #9767
  • Update arrow-rs, datafusion-federation, and datafusion-table-providers dependencies by @phillipleblanc in #9769
  • Chunk metastore batch inserts to respect SQLite parameter limits by @phillipleblanc in #9770
  • Improve JSON SODA support by @lukekim in #9795
  • Add ADBC Data Connector by @lukekim in #9723
  • docs: Release Cayenne as RC by @peasee in #9766
  • cli[feat]: cloud mode to use region-specific endpoints by @lukekim in #9803
  • Include updated JSON formats in HTTPS connector by @lukekim in #9800
  • Flight DoPut: Partition-aware write-through forwarding by @Jeadie in #9759
  • Pass through authentication to ADBC connector by @lukekim in #9801
  • Move scheduler_state_location from adapter metadata to env var by @phillipleblanc in #9802
  • Fix Cayenne DoPut upsert returning stale data after 3+ writes by @phillipleblanc in #9806
  • Fix JSON column projection producing schema mismatch by @sgrebnov in #9811
  • Fix http connector by @krinart in #9818
  • Fix ADBC Connector build and test by @lukekim in #9813
  • Support update & delete DML for distributed cayenne catalog by @Jeadie in #9805
  • Set allow_http param when S3 endpoint uses http scheme by @phillipleblanc in #9834
  • fix: Cayenne Catalog DDL requires a connected executor in distributed mode by @Jeadie in #9838
  • fix: Add conditional put support for file:// scheduler state location by @Jeadie in #9842
  • fix: Require the DDL primary key contain the partition key by @Jeadie in #9844
  • fix: Databricks SQL Warehouse schema retrieval with INLINE disposition and async retry by @lukekim in #9846
  • Filter pushdown improvements for SqlTable by @lukekim in #9852
  • feat: add iam_role_source parameter for AWS credential configuration by @lukekim in #9854
  • Fix ODBC queries silently returning 0 rows on query failure by @lukekim in #9864
  • feat(adbc): Add ADBC catalog connector with schema/table discovery by @lukekim in #9865
  • Make Turso SQL unparsing more robust and fix date comparisons by @lukekim in #9871
  • Fix Flight/FlightSQL filter precedence and mutable query consistency by @lukekim in #9876
  • Partial Aggregation optimisation for FlightSQLExec by @lukekim in #9882
  • fix: v1/responses API preserves client instructions when system_prompt is set by @Jeadie in #9884
  • feat: emit scheduler_active_executors_count and use it in spidapter by @Jeadie in #9885
  • feat: Add custom auth header support for GraphQL connector by @krinart in #9899
  • Add --endpoint flag to spice run with scheme-based routing by @lukekim in #9903
  • When executor connects, send DDL for existing tables by @Jeadie in #9904
  • fix: Improve ADBC driver shutdown handling and error classification by @lukekim in #9905
  • fix: require all executors to succeed for distributed DML (DELETE/UPDATE) forwarding by @Jeadie in #9908
  • fix(cayenne catalog): fix catalog refresh race condition causing duplicate primary keys by @Jeadie in #9909
  • Remove Perplexity support by @Jeadie in #9910
  • Fix refresh_sql support for debezium constraints by @krinart in #9912
  • Implement DML for DynamoDBTableProvider by @lukekim in #9915
  • chore: Update iceberg-rust fork to v0.9 by @lukekim in #9917
  • Run physical optimizer on FallbackOnZeroResultsScanExec fallback plan by @sgrebnov in #9927
  • Improve Databricks error message when dataset has no columns by @sgrebnov in #9928
  • Delta Lake: fix data skipping for >= timestamp predicates by @sgrebnov in #9932
  • fix: Ensure distributed Cayenne DML inserts are forwarded to executors by @Jeadie in #9948
  • Add full query federation support for ADBC data connector by @lukekim in #9953
  • Make time_format deserialization case-insensitive by @vyershov in #9955
  • Hash ADBC join-pushdown context to prevent credential leaks in EXPLAIN plans by @lukekim in #9956
  • fix: Normalize Arrow Dictionary types for DuckDB and SQLite acceleration by @sgrebnov in #9959
  • ADBC BigQuery: Improve BigQuery dialect date/time and interval SQL generation by @lukekim in #9967
  • Make BigQueryDialect more robust and add BigQuery TPC-H benchmark support by @lukekim in #9969
  • fix: Show proper unauthorized error instead of misleading runtime unavailable by @lukekim in #9972
  • fix: Enforce target_chunk_size as hard maximum in chunking by @lukekim in #9973
  • Add caching retention by @krinart in #9984
  • fix: improve Databricks schema error detection and messages by @lukekim in #9987
  • fix: Set default S3 region for opendal operator and fix cayenne nextest by @phillipleblanc in #9995
  • fix(PostgreSQL): fix schema discovery for PostgreSQL partitioned tables by @sgrebnov in #9997
  • fix: Defer cache size check until after encoding for compressed results by @krinart in #10001
  • fix: Rewrite numeric BETWEEN to CAST(AS REAL) for Turso by @lukekim in #10003
  • fix: Handle integer time columns in append refresh for all accelerators by @sgrebnov in #10004
  • fix: preserve s3a:// scheme when building OpenDalStorageFactory with custom endpoint by @phillipleblanc in #10006
  • Fix ISO8601 time_format with Vortex/Cayenne append refresh by @sgrebnov in #10009
  • fix: Address data correctness bugs found in audit by @sgrebnov in #10015
  • fix(federation): fix SQL unparsing for Inexact filter pushdown with alias by @lukekim in #10017
  • Improve GitHub connector ref handling and resilience by @lukekim in #10023
  • feat: Add spice completions command for shell completion generation by @lukekim in #10024
  • fix: Fix data correctness bugs in DynamoDB decimal conversion and GraphQL pagination by @sgrebnov in #10054
  • Implement RefreshDataset for distributed control stream by @Jeadie in #10055
  • perf: Improve S3 parquet read performance by @sgrebnov in #10064
  • fix: Prevent write-through stalls and preserve PartitionTableProvider during catalog refresh by @Jeadie in #10066
  • feat: spice completions auto-detects shell directory and writes file by @lukekim in #10068
  • fix: Bug in DynamoDB, GraphQL, and ISO8601 refresh data handling by @sgrebnov in #10063
  • fix partial aggregation deduplication on string checking by @lukekim in #10078
  • fix: add MetastoreTransaction support to prevent concurrent transaction conflicts by @phillipleblanc in #10080
  • fix: Use GreedyMemoryPool, add spidapter query memory limit arg by @phillipleblanc in #10082
  • feat: Add metrics for EXPLAIN ANALYZE in FlightSQLExec by @lukekim in #10084
  • Use strict cast in try_cast_to to error on overflow instead of silent NULL by @sgrebnov in #10104
  • feat: Implement MERGE INTO for Cayenne catalog tables by @peasee in #10105
  • feat: Add distributed MERGE INTO support for Cayenne catalog tables by @peasee in #10106
  • Improve JSON format auto-detection for single multi-line objects by @lukekim in #10107
  • Add mode: file_update acceleration mode by @krinart in #10108
  • Coerce unsupported Arrow types to Iceberg v2 equivalents in REST catalog API by @peasee in #10109
  • fix: Update default query memory limit to 90% from 70% by @phillipleblanc in #10112
  • feat: Add mTLS client auth support to spice sql REPL by @lukekim in #10113
  • fix(datafusion-federation): report error on overflow instead of silent NULL by @sgrebnov in #10124
  • fix: Prevent data loss in MERGE when source has duplicate keys by @peasee in #10126
  • feat: Add ClickHouse Date32 type support by @sgrebnov in #10132
  • Add Delta Lake column mapping support (Name/Id modes) by @sgrebnov in #10134
  • fix: Restore Turso numeric BETWEEN rewrite lost in DML revert by @lukekim in #10139
  • fix: Enable arm64 Linux builds with fp16 and lld workarounds by @lukekim in #10142
  • fix: remove double trailing slash in Unity Catalog storage locations by @sgrebnov in #10147
  • fix: Improve GitHub GraphQL client resilience and performance by @lukekim in #10151
  • Enable reqwest compression and optimize HTTP client settings by @lukekim in #10154
  • fix: executor startup failures by @Jeadie in #10155
  • feat: Distributed runtime.task_history support by @Jeadie in #10156
  • fix: Preserve timestamp timezone in DDL forwarding to executors by @peasee in #10159
  • feat: Per-model rate-limited concurrent AI UDF execution by @Jeadie in #10160
  • fix(Turso): Reject subquery/outer-ref filter pushdown in Turso provider by @lukekim in #10174
  • Fix linux/macos spice upgrade by @phillipleblanc in #10194
  • Improve CREATE TABLE LIKE error messages, success output, EXPLAIN, and validation by @peasee in #10203
  • fix: chunk MERGE delete filters and update Vortex for stack-safe IN-lists by @peasee in #10207
  • Propagate runtime.params.parquet_page_index to Delta Lake connector by @sgrebnov in #10209
  • Properly mark dataset as Ready on Scheduler by @Jeadie in #10215
  • fix: handle Utf8View/LargeUtf8 in GitHub connector ref filters by @lukekim in #10217
  • fix(databricks): Fix schema introspection and timestamp overflow by @lukekim in #10226
  • fix(databricks): Fix schema introspection failures for non-Unity-Catalog environments by @lukekim in #10227
  • feat: Add pagination support to HTTP data connector by @lukekim in #10228
  • feat(databricks): DESCRIBE TABLE fallback and source-native type parsing for Lakehouse Federation by @lukekim in #10229
  • fix(databricks): harden HTTP retries, compression, and token refresh by @lukekim in #10232
  • feat[helm chart]: Add support for ServiceAccount annotations and AWS IRSA example by @peasee in #9833
  • fix: Log warning and fall back gracefully on Cayenne config change by @krinart in #9092
  • fix: Handle engine mismatch gracefully in snapshot fallback loop by @krinart in #9187

Full Changelog: https://github.com/spiceai/spiceai/compare/v2.0.0-rc.1...v2.0.0-rc.2

Spice v1.11.5 (Apr 1, 2026)

· 4 min read
Sergei Grebnov
Senior Software Engineer at Spice AI

Announcing the release of Spice v1.11.5! 🛠️

Spice v1.11.5 is a patch release improving on_zero_results: use_source fallback performance, Delta Lake timestamp predicate data skipping, S3 Parquet read performance, PostgreSQL partitioned table support, Cayenne target file size handling, and preparing the CLI for v2.0 runtime upgrades.

What's New in v1.11.5

on_zero_results: use_source Fallback Performance Improvement

The on_zero_results: use_source fallback path now runs DataFusion's SessionState::physical_optimizers() rules on the federated scan plan before execution (#9927), enabling parallel file group scanning and other optimizations. This results in significantly faster fallback queries on multi-core machines, particularly for file-based data sources such as Delta Lake.

Delta Lake: Improved Data Skipping for >= Timestamp Predicates

Delta Lake table scans with >= timestamp filters now correctly prune files that do not match the predicate (#9932), improving query performance through more effective data skipping (file-level pruning).
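A query shape that now benefits from file-level pruning; the table and column names are illustrative:

```sql
SELECT count(*)
FROM events
WHERE ts >= TIMESTAMP '2026-01-01 00:00:00';
```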

PostgreSQL: Partitioned Tables Support

The PostgreSQL data connector now supports partitioned tables (#9997) for both federated and accelerated queries.

S3 Parquet Read Performance Improvement

Improved parquet read performance from S3 and other object stores (#10064), particularly for tables with many columns. Column data ranges are now coalesced into fewer, larger requests instead of being fetched individually, reducing the number of HTTP round-trips.
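The coalescing idea can be illustrated with a small sketch: nearby byte ranges are merged into fewer, larger GET requests. The gap threshold here is a made-up tuning knob, and this is not Spice's actual implementation:

```python
# Illustrative sketch of byte-range coalescing for object-store reads:
# nearby column-chunk ranges are merged into fewer, larger GET requests.
# The max_gap threshold is a hypothetical tuning knob, not a Spice setting.
def coalesce_ranges(ranges, max_gap=1024):
    """Merge (start, end) byte ranges whose gaps are <= max_gap bytes."""
    merged = []
    for start, end in sorted(ranges):
        if merged and start - merged[-1][1] <= max_gap:
            merged[-1][1] = max(merged[-1][1], end)  # extend previous range
        else:
            merged.append([start, end])
    return [tuple(r) for r in merged]

# Two column chunks close together collapse into a single request:
print(coalesce_ranges([(0, 100), (150, 300), (5000, 6000)], max_gap=100))
# -> [(0, 300), (5000, 6000)]
```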

Cayenne: Ensure Target File Size is Respected

The Cayenne accelerator now correctly respects the configured target file size (#10071). Previously, Cayenne could produce many small, fragmented Vortex files; with this fix, files are written at the expected target size, improving storage efficiency and query performance.

CLI: Support for v2.0 Runtime Upgrades

The Spice CLI can now upgrade to v2.0 runtime versions. This enables upgrading to v2.0 release candidates and, once released, the v2.0 stable runtime.

spice upgrade v2.0.0-rc.1

Running spice upgrade without a version will upgrade to the latest stable version, including v2.0 once released.

Note: Native Windows runtime builds will no longer be provided in v2.0. Use WSL for local development instead.

Contributors

Breaking Changes

No breaking changes.

Cookbook Updates

No new cookbook recipes.

The Spice Cookbook includes 86 recipes to help you get started with Spice quickly and easily.

Upgrading

To upgrade to v1.11.5, use one of the following methods:

CLI:

spice upgrade

Homebrew:

brew upgrade spiceai/spiceai/spice

Docker:

Pull the spiceai/spiceai:1.11.5 image:

docker pull spiceai/spiceai:1.11.5

For available tags, see DockerHub.

Helm:

helm repo update
helm upgrade spiceai spiceai/spiceai --version 1.11.5

AWS Marketplace:

Spice is available in the AWS Marketplace.

What's Changed

Changelog

  • fix(runtime): Run physical optimizer on FallbackOnZeroResultsScanExec fallback plan by @sgrebnov in #9927
  • fix(delta_lake): Fix data skipping for >= timestamp predicates by @sgrebnov in #9932
  • fix(PostgreSQL): Fix schema discovery for PostgreSQL partitioned tables by @sgrebnov in #9997
  • fix(cli): Skip models variant download for v2+ in upgrade/install by @lukekim and @sgrebnov in #10052
  • perf(s3): Improve Parquet read performance by @sgrebnov in #10064
  • fix(cayenne): Ensure Cayenne respects target file size by @krinart in #10071

Full Changelog: https://github.com/spiceai/spiceai/compare/v1.11.4...v1.11.5

Spice v2.0-rc.1 (Mar 4, 2026)

· 23 min read
Sergei Grebnov
Senior Software Engineer at Spice AI

Announcing the release of Spice v2.0-rc.1! 🚀

v2.0.0-rc.1 is the first release candidate for early testing of v2.0.

Highlights in this release candidate include:

  • Active-Active Highly-Available Distributed Query that is object-store-native and built on Apache Ballista, with dynamic cluster sizing, distributed ingestion, and cluster observability
  • Spice Cayenne RC with staged append writes, file-based retention deletes, composite partitioning, and distributed ingestion
  • DataFusion v52.2.0 Upgrade with sort pushdown, a new merge join, and dynamic filters
  • DDL Support for CREATE TABLE and DROP TABLE via SQL for Iceberg and Cayenne catalogs
  • DuckLake Catalog & Data Connector for lakehouse-style data management
  • GCS Data Connector (Alpha) for Google Cloud Storage
  • Rust CLI Rewrite for a unified single-binary experience
  • Dependency upgrades including DuckDB v1.4.4, delta_kernel v0.18.2, and mistral.rs

Spice v2.0 includes several breaking changes. Review the breaking changes section before upgrading.

Distribution Changes

AI/ML support including local LLM/ML model and hosted LLM inference is now included in the default Spice build and image. The separate models build variant has been removed.

With models now included by default, the data-only distribution (without AI/ML support) is only published in nightly builds. Official production-ready data-only distributions are available exclusively through Spice Cloud and the Enterprise release.

A new Network Attached Storage (NAS) distribution with built-in SMB and NFS data connector support is also now available in nightly builds and with Spice.ai Enterprise.

| Distribution / Variant | Open Source | Spice Cloud | Enterprise |
|---|---|---|---|
| Default | ✅ | ✅ | ✅ |
| Data | Nightly only | ✅ | ✅ |
| NAS (SMB + NFS) | Nightly only | ❌ | ✅ |
| Metal (macOS) | ✅ | ✅ | ✅ |
| CUDA (Linux) | Nightly only | ✅ | ✅ |
| Allocator variants | Nightly only | ✅ | ✅ |
| ODBC connector | Local build only | ✅ | ✅ |

For more details, see the Distributions documentation.

What's New in v2.0.0-rc.1

Active-Active HA Distributed Query

Distributed Query exits Beta, now offering active-active, highly-available, object-store-based query execution.

Distributed query supports two execution modes:

  • Synchronous: Queries for accelerated datasets are distributed across executors and results are streamed back in real-time. Non-accelerated datasets execute only on the scheduler. Best for interactive queries where low latency is critical.
  • Asynchronous: Queries are submitted via the new HTTP-only /v1/queries API and results are materialized to object storage for later retrieval. Best for long-running analytical workloads, batch processing, and non-accelerated datasets in distributed mode.

Key improvements:

  • Dynamic Cluster Sizing: The query planner automatically adjusts parallelism based on the number of active executors in the cluster, ensuring optimal resource utilization as nodes are added or removed.
  • Distributed Ingestion: Data ingestion for partitioned accelerated tables is now distributed across executor nodes, enabling higher throughput and parallel data loading in cluster mode. Regular (non-partitioned) accelerated tables do not distribute ingestion loads.
  • Synchronous Execution on Scheduler: /v1/sql and FlightSQL queries now execute synchronously on the scheduler when appropriate, reducing inter-node overhead for queries that don't benefit from distribution.
  • Faster Failure Detection: Executor heartbeat timeout reduced from 180s to 30s, enabling the cluster to quickly detect and respond to executor failures.
  • Cluster Observability: New metrics and Grafana dashboard for monitoring distributed query clusters.

Spice Cayenne Improvements

The Spice Cayenne data accelerator exits Beta with significant reliability and performance improvements:

  • Staged Append Writes: WAL-based staged append writes prevent partial writes and data loss on stream errors. Batches are written to a WAL file before being committed, ensuring atomicity.
  • File-Based Retention Deletes: Time-based retention now supports file-level deletes for both position-based and primary-key tables, reducing I/O overhead compared to row-level deletion.
  • Multiple Partition Expressions: Support for composite partitioning with partition_by: [col1, col2] using hierarchical path-like keys (e.g., 2025/10/15).
  • Distributed Ingestion: Cayenne catalog now supports distributed ingestion across executor nodes in cluster mode, including UPDATE operations.
  • Improved Robustness: Fixed CDC edge case where DELETE + UPSERT sequences could produce duplicate primary keys across protected snapshots. Improved upsert handling during runtime restarts.
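As a hedged sketch, a Cayenne-accelerated dataset using composite partitioning might be configured as follows. The partition_by form comes from the release notes; the dataset name, source, and the exact placement of the parameter under acceleration params are illustrative assumptions:

```yaml
# Illustrative spicepod fragment; dataset name, source, and parameter
# placement are assumptions, not a verified configuration.
datasets:
  - from: s3://my-bucket/events/
    name: events
    acceleration:
      enabled: true
      engine: cayenne
      params:
        # Composite partitioning: rows are grouped under hierarchical
        # path-like keys, e.g. 2025/10/15.
        partition_by: [year, month, day]
```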

DataFusion v52.2.0 Upgrade

Apache DataFusion has been upgraded to v52.2.0, bringing significant performance improvements, new query features, and enhanced extensibility.

Performance Improvements:

  • Faster CASE Expressions: Lookup-table-based evaluation for certain CASE expressions avoids repeated evaluation, accelerating common ETL patterns
  • MIN/MAX Aggregate Dynamic Filters: Queries with MIN/MAX aggregates now create dynamic filters during scan to prune files and rows as tighter bounds are discovered during execution
  • New Merge Join: Rewritten sort-merge join (SMJ) operator with speedups of three orders of magnitude in pathological cases (e.g., TPC-H Q21: minutes → milliseconds)
  • Caching Improvements: New statistics cache for file metadata avoids repeatedly recalculating statistics, significantly improving planning time. A prefix-aware list-files cache accelerates evaluating partition predicates for Hive partitioned tables
  • Improved Hash Join Filter Pushdown: Build-side hash map contents are now passed dynamically to probe-side scans for pruning files, row groups, and individual rows

Major Features:

  • Sort Pushdown to Scans: Sorts are pushed into data sources, enabling ~30x performance improvement on pre-sorted data with top-K queries. Parquet scans now reverse row group order for DESC queries on ASC-sorted files
  • TableProvider supports DELETE and UPDATE: New hooks for DELETE and UPDATE statements in the TableProvider trait, enabling Iceberg and Cayenne connectors to implement SQL DELETE and UPDATE operations
  • More Extensible SQL Planning: New RelationPlanner API for extending SQL planning for FROM clauses, enabling support for vendor-specific SQL dialects

DDL Support for Iceberg and Cayenne

SQL Schema Management: Spice now supports CREATE TABLE and DROP TABLE DDL operations for Iceberg and Cayenne catalogs via FlightSQL and the /v1/sql API. DML validation has been updated for catalog-level writability.
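For example, assuming a writable Cayenne catalog attached as cayenne, schema management might look like the following. The catalog, schema, table, and column names are illustrative, not taken from the release:

```sql
-- Illustrative only; catalog, schema, table, and column names are hypothetical.
CREATE TABLE cayenne.public.events (
  id BIGINT,
  ts TIMESTAMP,
  payload VARCHAR
);

DROP TABLE cayenne.public.events;
```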

DuckLake Catalog & Data Connector

Lakehouse-Style Data Management: New DuckLake catalog and data connector enable lakehouse-style data management with DuckDB as the metadata catalog and object storage for data files. DuckLake provides ACID transactions, time travel, and schema evolution on top of Parquet files.

GCS Data Connector (Alpha)

Google Cloud Storage Support: New Google Cloud Storage data connector enables federated queries against data stored in GCS buckets, with Iceberg table support.

Rust CLI Rewrite

Unified Single-Binary Experience: The Spice CLI has been completely rewritten from Go to Rust, eliminating the Go dependency and providing a single spice binary built from the same codebase as spiced. This improves startup performance, reduces distribution size, and ensures consistent behavior between CLI and runtime.

Key Features:

  • Full Feature Parity: All 27+ CLI commands re-implemented in Rust with identical behavior
  • New spice query Command: Interactive REPL for async queries via the /v1/queries API with multi-line SQL input, spinner progress indicator, Ctrl+C cancellation, and partial query ID matching
  • --output=json Flag: Machine-readable JSON output for CLI commands, enabling scripting and automation
  • spice login --output: New output modes (env, json, keychain) for flexible credential management
  • spice cloud metrics: New command for Spice Cloud deployment metrics

Models Included by Default

Local LLM/ML model inference (via mistral.rs) is now included in the default Spice build. The separate models build variant has been removed. This simplifies installation and ensures all users have access to local AI inference capabilities.

Error Propagation for Dataset and Model Status APIs

The /v1/datasets and /v1/models APIs now return structured error information when a component is in an Error state. The ?status=true query parameter must be passed to retrieve the real-time component status, including the error state and details. Previously, the status field only indicated Error with no further detail. Now, two new fields are included when ?status=true is specified:

  • error: A structured object with category, type, and code fields for programmatic error handling (e.g. { "category": "dataset", "type": "auth", "code": "dataset.auth" }).
  • error_message: A human-readable description of why the component entered an error state.

These fields are only present when ?status=true is passed and the component is in an error state.

Example /v1/datasets?status=true response:

[
  {
    "from": "postgres:syncs",
    "name": "daily_journal",
    "replication_enabled": false,
    "acceleration_enabled": true,
    "status": "Ready"
  },
  {
    "from": "databricks:hive_metastore.default.messages",
    "name": "messages",
    "replication_enabled": false,
    "acceleration_enabled": true,
    "status": "Error",
    "error": {
      "category": "dataset",
      "type": "auth",
      "code": "dataset.auth"
    },
    "error_message": "Unable to authenticate with datasource credentials"
  }
]
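A client can branch on these fields programmatically. A minimal sketch, assuming the response structure shown above (the helper function is ours, not part of any Spice SDK):

```python
# Minimal sketch: extract error details from a /v1/datasets?status=true
# response. The JSON structure follows the example above; failing_datasets
# is an illustrative helper, not a Spice SDK function.
import json

response_body = """
[
  {"from": "postgres:syncs", "name": "daily_journal", "status": "Ready"},
  {"from": "databricks:hive_metastore.default.messages", "name": "messages",
   "status": "Error",
   "error": {"category": "dataset", "type": "auth", "code": "dataset.auth"},
   "error_message": "Unable to authenticate with datasource credentials"}
]
"""

def failing_datasets(payload: str) -> list[tuple[str, str]]:
    """Return (name, error code) pairs for datasets in the Error state."""
    return [
        (d["name"], d["error"]["code"])
        for d in json.loads(payload)
        if d.get("status") == "Error"
    ]

print(failing_datasets(response_body))  # [('messages', 'dataset.auth')]
```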

The spice datasets and spice models CLI commands now include an ERROR column that displays the error message for any component in an error state.

Additional Dependency Upgrades

| Dependency | Version |
|---|---|
| Ballista | v52.0.0 |
| DuckDB | v1.4.4 |
| delta_kernel | v0.18.2 |
| mistral.rs | v0.7.0 (candle fork removed, now uses candle 0.9.2 from crates.io) |
| Turso (libsql) | v0.4.4 |
| Vortex | Upgraded with CASE-WHEN support |
| AWS SDK | Multiple crates updated + APN user-agent support |

Other Improvements

  • Spicepod v2 Support: Spicepods now support version v2, and spice init generates spicepod.yaml files with version: v2 by default while maintaining backward compatibility for existing v1 spicepods.
  • x.ai Models: x.ai models now exclusively use the /v1/responses endpoint with rate limiting support.
  • HuggingFace Chat Templates: Added support for chat templates in HuggingFace model configurations.
  • Databricks SQL Dialect: Added Databricks SQL dialect for DataFusion unparser, improving federation query generation.
  • Snowflake: Added snowflake_private_key parameter for key-pair authentication.
  • Acceleration Metrics: New rows_written, bytes_written, and dataset_acceleration_size_bytes metrics for acceleration refresh ingestion.
  • Refresh SQL UDFs: Core scalar UDFs are now enabled in refresh SQL expressions.
  • FlightSQL: Fixed TLS connection handling for grpc+tls:// endpoints with custom CA certificate support.
  • FlightSQL: Fixed schema consistency by expanding view types and verifying field names.
  • Hash Index: Fixed query correctness when hash index is used with additional filters.
  • Results Cache: Fixed schema preservation for empty query results.
  • Query Nullability: Reconciled execution stream nullability with logical plan schema.
  • Schema Evolution: Graceful handling of schema evolution mismatch errors during data refresh.
  • Internal YAML Parser: Replaced deprecated serde_yaml with an internal YAML implementation.

Spicepod v1 to v2 Changes

Spicepod v2 introduces configuration improvements while maintaining backward compatibility with v1. Existing v1 spicepods continue to work; deprecated fields are automatically migrated at load time.

Version support:

| Version | Status |
|---|---|
| v2 | Default. Used by spice init. |
| v1 | Supported. Deprecated fields auto-migrate. |
| v1beta1 | Removed. No longer accepted. |

Configuration changes:

| v1 (deprecated) | v2 (preferred) | Notes |
|---|---|---|
| runtime.results_cache | runtime.caching.sql_results | All fields migrate automatically. cache_max_size → max_size. |
| runtime.memory_limit | runtime.query.memory_limit | Auto-migrated. query.memory_limit takes priority if both set. |
| runtime.temp_directory | runtime.query.temp_directory | Auto-migrated. query.temp_directory takes priority if both set. |
| dataset.invalid_type_action | dataset.unsupported_type_action | Auto-migrated. v2 adds a new string variant. |

New v2 fields:

  • runtime.ready_state: Controls when the runtime reports ready (on_load default, or on_registration).
  • runtime.flight.do_put_rate_limit_enabled: Enable/disable FlightSQL DoPut rate limiting (default: true).
  • runtime.query.spill_compression: Compression for query spill files (e.g., lz4_frame).
  • runtime.scheduler.partition_management: Configure partition assignment interval, limits, and timeouts for distributed mode.
  • runtime.caching.sql_results.stale_while_revalidate_ttl: Serve stale cached results while revalidating in the background.
  • runtime.caching.sql_results.encoding: Cache entry compression (e.g., zstd).
  • catalog.access: read_write_create. New access mode for catalogs that support DDL operations.
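A hedged sketch of a v2 spicepod exercising several of the new fields. The field paths come from the list above; the values and the enabled flag under sql_results are illustrative assumptions:

```yaml
# Illustrative spicepod.yaml; values are examples, not recommendations,
# and the sql_results.enabled flag is an assumption.
version: v2
kind: Spicepod
name: my-app
runtime:
  ready_state: on_registration
  query:
    spill_compression: lz4_frame
  caching:
    sql_results:
      enabled: true
      stale_while_revalidate_ttl: 30s
      encoding: zstd
```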

Migration note: When both the deprecated v1 field and its v2 equivalent are set, the v2 field takes priority.

Contributors

Breaking Changes

  • Cayenne and Distributed Query exit Beta: Beta warnings have been removed from documentation and code. Both features are now considered GA-ready.
  • Models included by default: The separate models build variant has been removed. Local LLM inference is now always included.
  • Spicepod version defaults to v2: New spicepods created with spice init now default to version: v2. Existing v1 spicepods remain supported, and v1beta1 is no longer accepted.
  • Windows native builds removed: Native Windows builds are no longer provided. Use WSL for local development instead.
  • Metric renames: accelerated_refresh metrics renamed to acceleration_refresh for consistency. last_refresh_time gauge renamed to include milliseconds unit.
  • Caching config renamed: ResultsCache replaced with SQLResultsCacheConfig in configuration.
  • DuckDB parameter rename: partitioned_write_flush_threshold renamed to partitioned_write_flush_threshold_rows.
  • v1/search API: The /v1/search API now always returns an array in matches, even for single results.
  • x.ai model endpoint: x.ai models now exclusively use the /v1/responses endpoint.
  • Error messages: Error messages across S3 Vectors, ScyllaDB, Snowflake, ClickHouse, and other components have been refactored for clarity and consistency.

Cookbook Updates

New and updated Spice Cookbook recipes:

  • Async Queries: Submit long-running queries asynchronously and retrieve results later.
  • DuckLake Catalog Connector: Use DuckLake for lakehouse-style data management with ACID transactions and time travel.

The Spice Cookbook includes 88 recipes to help you get started with Spice quickly and easily.

Upgrading

To upgrade to v2.0.0-rc.1, use one of the following methods:

CLI:

spice upgrade

Homebrew:

brew upgrade spiceai/spiceai/spice

Docker:

Pull the spiceai/spiceai:2.0.0-rc.1 image:

docker pull spiceai/spiceai:2.0.0-rc.1

For available tags, see DockerHub.

Helm:

helm repo update
helm upgrade spiceai spiceai/spiceai --version 2.0.0-rc.1

AWS Marketplace:

Spice is available in the AWS Marketplace.

What's Changed

Changelog

  • Add TPC-DS integration tests with S3 source and PostgreSQL acceleration by @phillipleblanc in #9006
  • fix(tests): fix flaky/slow/failing unit tests by @phillipleblanc in #9009
  • fix: Update benchmark snapshots for DF51 upgrade by @app/github-actions in #9008
  • fix: add feature gate to rrf TEST_EMBEDDING_MODEL by @phillipleblanc in #9017
  • fix: features check by @phillipleblanc in #9014
  • fix: Enable Cayenne acceleration snapshots by @lukekim in #9020
  • URL table support by @lukekim in #9018
  • ScyllaDB key filter by @lukekim in #8997
  • fix: Schema mismatch when using column projection with HTTP caching by @phillipleblanc in #9021
  • Add more tests for HTTP caching with columns selection by @sgrebnov in #9025
  • HTTP cache snapshots: default to time_interval and fix snapshots_creation_policy: on_change by @sgrebnov in #9026
  • Fix duplicate snapshot creation on startup by @sgrebnov in #9029
  • Add ScyllaDB and SMB to the README table by @krinart in #9034
  • Remove waiting for runtime to be ready before creating snapshot by @krinart in #9033
  • Fix snapshot on_change policy to skip when no writes occurred by @sgrebnov in #9028
  • Release notes for release release/1.11.0-rc.2 by @krinart in #9016
  • ci: use arduino/setup-protoc for official protobuf compiler by @phillipleblanc in #9036
  • ci: install unzip on aarch64 runner for arduino/setup-protoc by @phillipleblanc in #9038
  • fix: don't fail release if upload to minio fails by @phillipleblanc in #9039
  • Add missing protoc step to setup-cc action by @krinart in #9041
  • fix: Update Search integration test snapshots by @app/github-actions in #9013
  • Fix formula_1 and codebase_community in bird-bench by @Jeadie in #9000
  • Cayenne S3 Express One Zone improvements by @lukekim in #9015
  • Add zlib1g-dev to CI by @lukekim in #9052
  • Improve validation and logging for hash indexes by @lukekim in #9047
  • Upgrade Vortex with CASE-WHEN by @lukekim in #9051
  • x.ai models now exclusively use /v1/responses endpoint by @lukekim in #9400
  • Improvements for snapshot schema comparison by @krinart in #9401
  • v2.0 breaking changes by @lukekim in #9233
  • Create PartitionManagementTask for scheduler to update accelerated table partition assignments by @Jeadie in #9378
  • refactor(Cayenne): route all write orchestration through CayenneDataSink by @sgrebnov in #9402
  • Refactor benchmark to use QueryExecutor trait by @Jeadie in #9418
  • feat: Add spidapter build and release workflow by @peasee in #9427
  • Testoperator: add support for api-key when connecting to external spice instance by @sgrebnov in #9421
  • Initial implementation of Ducklake catalog & data connectors by @lukekim in #9083
  • Require aws_lc_rs since jsonwebtoken upgrade by @Jeadie in #9426
  • feat: Add spidapter tool by @peasee in #9425
  • Add release notes for 1.11.2 patch release by @sgrebnov in #9430
  • feat(spidapter): integrate system-adapter-protocol with SCP provisioning by @phillipleblanc in #9434
  • Add DuckLake TPCH E2E workflow and federated Spicepod configuration by @lukekim in #9431
  • fix(spidapter): use Flight handshake auth instead of x-api-key header by @phillipleblanc in #9435
  • [spidapter] Keep only what sparks joy by @Jeadie in #9439
  • Refactor binary operator balancing by @Jeadie in #9424
  • feat: Add Iceberg DDL support (CREATE TABLE / DROP TABLE) for default catalog override by @phillipleblanc in #9440
  • Fix Flight SQL schema consistency: expand view types and verify field names by @sgrebnov in #9438
  • Update spidapter for new system-adapter-protocol by @sgrebnov in #9442
  • docs: fix typos and syntax errors in style guide and error handling docs by @cluster2600 in #9445
  • Add acceleration refresh ingestion metrics (rows_written, bytes_written) by @phillipleblanc in #9461
  • Refactor(Cayenne): Replace CatalogError and string based errors with Snafu errors by @sgrebnov in #9403
  • Replace deprecated claude-3-5-haiku-latest with claude-haiku-4-5 by @Jeadie in #9492
  • Fix #9481: Preserve schema in results cache for empty query results by @phillipleblanc in #9485
  • Fix partition by serializing by @Jeadie in #9474
  • query: reconcile execution stream nullability with logical plan schema by @phillipleblanc in #9486
  • initial spice-cloud-client crate and spice cloud metrics --app <app-name>. by @Jeadie in #9480
  • feat: Return dataset error message in datasets API by @peasee in #9487
  • Spicebench by @lukekim in #9447
  • build(deps): consolidate dependabot dependency updates by @phillipleblanc in #9504
  • fix(cluster): route non-partitioned accelerated tables in distributed mode by @phillipleblanc in #9508
  • Enable core scalar UDFs in refresh SQL by @sgrebnov in #9502
  • Fix metrics in Spidapter again by @Jeadie in #9497
  • fix(cluster): tolerate Completed->status propagation race in distributed query handle by @phillipleblanc in #9510
  • feat: Support distributed ingestion in cayenne catalog by @peasee in #9506
  • Fix Cayenne duplicate primary keys after DELETE + UPSERT CDC sequences by @krinart in #9494
  • fix(cluster): rewrite table scans inside subqueries for distributed execution by @phillipleblanc in #9518
  • fix: Set catalog mode to readwritecreate in spidapter by @peasee in #9519
  • Upgrade AWS SDK crates & set APN user-agent in AWS SDK credential bridge by @lukekim in #8328
  • feat(runtime): add runtime ready_state on_registration semantics by @lukekim in #9522
  • fix: Add spidapter post-setup retries by @peasee in #9526
  • Make partition discovery more robust and make initialization non-blocking by @sgrebnov in #9499
  • Make lint-rust-fix support targeted packages and features by @Jeadie in #9511
  • Handle new Cloud SCP API by @Jeadie in #9532
  • Refactor and simplify streaming benchmarks by @krinart in #9405
  • fix: ensure spidapter only increments attempts on failures by @peasee in #9534
  • feat: Support specifying app resources in spidapter by @peasee in #9536
  • test(runtime): Spice Cayenne DDL integration test by @lukekim in #9535
  • fix: Handle schema evolution mismatch errors during data refresh by @lukekim in #9527
  • fix: resolve clippy lint warnings by @phillipleblanc in #9547
  • pr-builds --tag <TAG> for build_and_release.yml by @Jeadie in #9507
  • Add --output flag to spice login with env/json/keychain modes by @Jeadie in #9541
  • Don't use 'PartitionedTableScanRewrite' in async distributed query by @Jeadie in #9548
  • feat(spidapter): add local backend mode with single executor by @phillipleblanc in #9531
  • support chat template in HF by @Jeadie in #9543
  • fix(cayenne): stream PK retention deletes and run OOM regression in CI by @phillipleblanc in #9533
  • cayenne: Staged append writes to prevent partial writes and data loss on stream error by @sgrebnov in #9491
  • AcceleratedTable::scan use FederatedTable::scan when ClusterRole::Scheduler by @Jeadie in #9550
  • Upgrade to delta-kernel-rs v0.18.2 by @lukekim in #9528
  • Run cayenne tests as part of PR CI by @sgrebnov in #9554
  • Upgrade to DataFusion v52.2.0 by @lukekim in #9419
  • Remove Snapshot Compaction + Add snapshot existence check by @krinart in #9523
  • Update dependencies by @lukekim in #9566
  • fix: Update benchmark snapshots by @app/github-actions in #9565
  • fix: Compare Cayenne table configuration on startup by @peasee in #9529
  • Make Refresh::refresh_sql more robust to alterations over time. by @Jeadie in #9549
  • fix: Update datafusion-table-providers dependency to latest revision by @lukekim in #9574
  • Unset AWS_ENDPOINT_URL when empty by @krinart in #9575
  • fix: allow BytesProcessedExec repartitioning for unordered input by @lukekim in #9540
  • Sanitize DataFusion errors by @lukekim in #9530
  • Add conditional logging for partition assignments by @Jeadie in #9577
  • use 'properly early exit on SIGTERM' by @Jeadie in #9573
  • Update datafusion to 52.2.0 by @phillipleblanc in #9582
  • Ensure we query one and only one partition per request by @Jeadie in #9416
  • feat: Add support for Spicepod version v2 by @lukekim in #9583
  • [SpiceDQ] Improve error messages; Avoid race condition on allocate_initial_partitions. by @Jeadie in #9579
  • Update ballista dependencies to latest 52.0.0 revision by @lukekim in #9581
  • Fix Databricks spark_connect mode always disabled by @phillipleblanc in #9586
  • Support partitioning in Arrow accelerator by @Jeadie in #9571
  • Fix spice query CLI response deserialization by @phillipleblanc in #9588
  • fix: Update benchmark snapshots by @app/github-actions in #9584
  • fix: Share RuntimeEnv across Cayenne read/write/delete paths for targeted list_files_cache invalidation by @sgrebnov in #9589
  • feat: Add file:// state_location support for async queries scheduler by @phillipleblanc in #9590
  • Update endgame links by @krinart in #9598

Full Changelog: https://github.com/spiceai/spiceai/compare/v1.11.2...v2.0.0-rc.1

Spice v1.11.1 (Feb 10, 2026)

· 4 min read
Jack Eadie
Token Plumber at Spice AI

Announcing the release of Spice v1.11.1! 🛠️

v1.11.1 is a patch release improving Spice Cayenne accelerator reliability and performance, enhancing DynamoDB Streams and HTTP data connectors, and fixing issues in Federated Task History and FlightSQL.

What's New in v1.11.1

Spice Cayenne Accelerator Improvements

This release includes stability and performance fixes for the Spice Cayenne accelerator:

  • Row-based Deletion Logic: Refactored row-based delete operations to use per-file deletion vectors with RoaringBitmap. Deletion scans now use Vortex-native streaming with filter pushdown and project only row indices, achieving zero data I/O for delete operations.
  • Constraints & On Conflict: constraints and on_conflict configurations are now automatically inferred from federated table metadata, enabling datasets like DynamoDB to work without explicitly defining primary_key in the Spicepod.
  • Partitioned Table Deletion: Fixed an issue where DELETE operations on partitioned Cayenne tables failed.
  • Data Integrity: Fixed two issues with acceleration snapshot handling: protected snapshots are now included in conflict detection keyset scans (preventing duplicate key creation during append refresh), and snapshot cleanup no longer deletes protected snapshots.
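The per-file deletion-vector idea above can be illustrated in a few lines. This simplified sketch uses a plain Python set where Cayenne uses a RoaringBitmap, and omits the Vortex-native streaming scan; it only models the core idea that deletes record row indices instead of rewriting data:

```python
# Simplified model of per-file deletion vectors: a delete records row
# indices per file rather than rewriting data files, so reads filter
# deleted rows and the delete itself performs zero data I/O.
# (Cayenne uses RoaringBitmap; a plain set stands in here.)
from collections import defaultdict

deletion_vectors: dict[str, set[int]] = defaultdict(set)

def delete_rows(file: str, row_indices: list[int]) -> None:
    """Record deleted row indices for a data file."""
    deletion_vectors[file].update(row_indices)

def live_rows(file: str, rows: list[str]) -> list[str]:
    """Apply the file's deletion vector at read time."""
    dv = deletion_vectors[file]
    return [r for i, r in enumerate(rows) if i not in dv]

delete_rows("part-0001.vortex", [1, 3])
print(live_rows("part-0001.vortex", ["a", "b", "c", "d"]))  # ['a', 'c']
```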

Data Connector Improvements

  • DynamoDB Streams: Added automatic re-bootstrapping when the stream lag exceeds DynamoDB shard retention (24h). Configurable via the new lag_exceeds_shard_retention_behavior parameter with values error (default), ready_before_load, or ready_after_load.
  • HTTP Connector: HTTP responses now include a response_status column (UInt16). 4xx responses (e.g., 404 Not Found) are treated as valid queryable data and cached normally. 5xx responses are retried with backoff, returned to the user, but excluded from the cache to prevent transient server errors from polluting cached results.
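A sketch of the new DynamoDB Streams parameter in a spicepod. The parameter name and values come from the release notes; the dataset name and its placement under params are illustrative assumptions:

```yaml
# Illustrative fragment; dataset name and parameter placement are
# assumptions. Valid values per the release notes: error (default),
# ready_before_load, ready_after_load.
datasets:
  - from: dynamodb:orders
    name: orders
    params:
      lag_exceeds_shard_retention_behavior: ready_before_load
```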

Other Improvements

  • Reliability: Added retries for SnapshotManager operations and general snapshot reliability improvements.
  • Reliability: Fixed handling of timestamp precision mismatches in query result caching.
  • Reliability: Fixed a double projection issue in federated task history queries that caused "Schema error: project index out of bounds" errors in cluster mode.
  • Developer Experience: Added cookie middleware support to the FlightSQL data connector.

Contributors

Breaking Changes

No breaking changes.

Cookbook Updates

No major cookbook updates. The Spice Cookbook includes 86 recipes to help you get started with Spice quickly and easily.

Upgrading

To upgrade to v1.11.1, use one of the following methods:

CLI:

spice upgrade

Homebrew:

brew upgrade spiceai/spiceai/spice

Docker:

Pull the spiceai/spiceai:1.11.1 image:

docker pull spiceai/spiceai:1.11.1

For available tags, see DockerHub.

Helm:

helm repo update
helm upgrade spiceai spiceai/spiceai --version 1.11.1

AWS Marketplace:

Spice is available in the AWS Marketplace.

What's Changed

Changelog

  • Cayenne: row-based delete logic improvements by @sgrebnov in #9237
  • Proper support for constraints/on_conflict in Cayenne Acceleration by @krinart in #9335
  • Retries for SnapshotManager by @krinart in #9334
  • fix(cayenne): Include protected snapshots in conflict detection keyset scan by @sgrebnov in #9176
  • fix(cayenne): Fix data loss by preserving protected snapshots during cleanup by @sgrebnov in #9182
  • Simplify retention filter expressions before pushdown by @sgrebnov in #9244
  • Fix test_retention_complex_sql by @sgrebnov in #9270
  • runtime: avoid double projection in federated task history by @phillipleblanc in #9326
  • feat(http): Return all HTTP responses as data, skip caching 5xx by @sgrebnov in #9313
  • Snapshots Improvements by @krinart in #9318
  • fix(caching): Handle timestamp precision mismatch and add more tests by @sgrebnov in #9315
  • DynamoDB Streams Table Rebootstrapping by @krinart in #9305
  • Fix Cayenne partitioned table deletion support by @sgrebnov in #9267
  • FlightSQL: add cookie middleware support by @phillipleblanc in #9282
  • Apply SchemaCastScanExec before applying changes in process_upsert_batch by @krinart in #9297

Spice v1.11.0 (Jan 28, 2026)

· 58 min read
William Croxson
Senior Software Engineer at Spice AI

Announcing the release of Spice v1.11.0-stable! ⚡

In Spice v1.11.0, Spice Cayenne reaches Beta status with acceleration snapshots, key-based deletion vectors, and Amazon S3 Express One Zone support. DataFusion has been upgraded to v51, along with Arrow v57.2 and iceberg-rust v0.8.0. v1.11 adds several DynamoDB and DynamoDB Streams improvements, such as JSON nesting, and significantly improves Distributed Query with active-active schedulers and mTLS for enterprise-grade high availability and secure cluster communication.

This release also adds new SMB, NFS, and ScyllaDB Data Connectors (Alpha), Prepared Statements with full SDK support (gospice, spice-rs, spice-dotnet, spice-java, spice.js, and spicepy), Google LLM Support for expanded AI inference capabilities, and significant improvements to caching, observability, and Hash Indexing for Arrow Acceleration.

What's New in v1.11.0

Spice Cayenne Accelerator Reaches Beta

Spice Cayenne has been promoted to Beta status with acceleration snapshots support and numerous performance and stability improvements.

Key Enhancements:

  • Key-based Deletion Vectors: Improved deletion vector support using key-based lookups for more efficient data management and faster delete operations. Key-based deletion vectors are more memory-efficient than positional vectors for sparse deletions.
  • S3 Express One Zone Support: Store Cayenne data files in S3 Express One Zone for single-digit millisecond latency, ideal for latency-sensitive query workloads that require persistence.

Improved Reliability:

  • Resolved FuturesUnordered reentrant drop crashes
  • Fixed memory growth issues related to Vortex metrics allocation
  • Metadata catalog now properly respects cayenne_file_path location
  • Added warnings for unparseable configuration values

For more details, refer to the Cayenne Documentation.

DataFusion v51 Upgrade

Apache DataFusion has been upgraded to v51, bringing significant performance improvements, new SQL features, and enhanced observability.

DataFusion v51 ClickBench Performance

Performance Improvements:

  • Faster CASE Expression Evaluation: Expressions now short-circuit earlier, reuse partial results, and avoid unnecessary scattering, speeding up common ETL patterns
  • Better Defaults for Remote Parquet Reads: DataFusion now fetches the last 512KB of Parquet files by default, typically avoiding 2 I/O requests per file
  • Faster Parquet Metadata Parsing: Leverages Arrow 57's new thrift metadata parser for up to 4x faster metadata parsing

New SQL Features:

  • SQL Pipe Operators: Support for |> syntax for inline transforms
  • DESCRIBE <query>: Returns the schema of any query without executing it
  • Named Arguments in SQL Functions: PostgreSQL-style param => value syntax for scalar, aggregate, and window functions
  • Decimal32/Decimal64 Support: New Arrow types supported including aggregations like SUM, AVG, and MIN/MAX

Example pipe operator:

SELECT * FROM t
|> WHERE a > 10
|> ORDER BY b
|> LIMIT 5;
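Similarly, DESCRIBE can now be applied to an arbitrary query to return its schema without executing it (the table and columns here are illustrative):

```sql
-- Returns the output schema of the query without running it.
DESCRIBE SELECT a, b FROM t WHERE a > 10;
```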

Improved Observability:

  • Improved EXPLAIN ANALYZE Metrics: New metrics including output_bytes, selectivity for filters, reduction_factor for aggregates, and detailed timing breakdowns

Arrow 57.2 Upgrade

Apache Arrow has been upgraded to v57.2, bringing major performance improvements and new capabilities.

Arrow 57 Parquet Metadata Parsing Performance

Key Features:

  • 4x Faster Parquet Metadata Parsing: A rewritten thrift metadata parser delivers up to 4x faster metadata parsing, especially beneficial for low-latency use cases and files with large amounts of metadata
  • Parquet Variant Support: Experimental support for reading and writing the new Parquet Variant type for semi-structured data, including shredded variant values
  • Parquet Geometry Support: Read and write support for Parquet Geometry types (GEOMETRY and GEOGRAPHY) with GeospatialStatistics
  • New arrow-avro Crate: Efficient conversion between Apache Avro and Arrow RecordBatches with projection pushdown and vectorized execution support

DynamoDB Connector Enhancements

  • Added JSON nesting for DynamoDB Streams
  • Improved batch deletion handling

Distributed Query Improvements

High Availability Clusters: Spice now supports running multiple active schedulers in an active/active configuration for production deployments. This eliminates the scheduler as a single point of failure and enables graceful handling of node failures.

  • Multiple schedulers run simultaneously, each capable of accepting queries
  • Schedulers coordinate via a shared S3-compatible object store
  • Executors discover all schedulers automatically
  • A load balancer distributes client queries across schedulers

Example HA configuration:

runtime:
  scheduler:
    state_location: s3://my-bucket/spice-cluster
    params:
      region: us-east-1

mTLS Verification: Cluster communication between scheduler and executors now supports mutual TLS verification for enhanced security.

Credential Propagation: S3, ABFS, and GCS credentials are now automatically propagated to executors in cluster mode, enabling access to cloud storage across the distributed query cluster.

Improved Resilience:

  • Exponential backoff for scheduler disconnection recovery
  • Increased gRPC message size limit from 16MB to 100MB for large query plans
  • HTTP health endpoint for cluster executors
  • Automatic executor role inference when --scheduler-address is provided
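The reconnection resilience above follows the standard capped-exponential-backoff pattern. As a sketch of the idea (the base delay, growth factor, and cap here are illustrative assumptions, not Spice's actual tuning):

```python
import random

def backoff_delays(base=0.5, factor=2.0, cap=30.0, attempts=6, jitter=False):
    """Yield capped exponential backoff delays (seconds) for reconnect attempts."""
    delay = base
    for _ in range(attempts):
        # Optional full jitter spreads simultaneous reconnects out over time
        yield random.uniform(0, delay) if jitter else delay
        delay = min(delay * factor, cap)

# Deterministic schedule: 0.5, 1.0, 2.0, 4.0, 8.0, 16.0
print(list(backoff_delays()))
```

With `jitter=True`, many executors losing the same scheduler avoid reconnecting in lockstep.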

For more details, refer to the Distributed Query Documentation.

iceberg-rust v0.8.0 Upgrade

Spice has been upgraded to iceberg-rust v0.8.0, bringing improved Iceberg table support.

Key Features:

  • V3 Metadata Support: Full support for Iceberg V3 table metadata format
  • INSERT INTO Partitioned Tables: DataFusion integration now supports inserting data into partitioned Iceberg tables
  • Improved Delete File Handling: Better support for position and equality delete files, including shared delete file loading and caching
  • SQL Catalog Updates: Implement update_table and register_table for SQL catalog
  • S3 Tables Catalog: Implement update_table for S3 Tables catalog
  • Enhanced Arrow Integration: Convert Arrow schema to Iceberg schema with auto-assigned field IDs, _file column support, and Date32 type support

Acceleration Snapshots

Acceleration snapshots enable point-in-time recovery and data versioning for accelerated datasets. Snapshots capture the state of accelerated data at specific points, allowing for fast bootstrap recovery and rollback capabilities.

Key Features:

  • Flexible Triggers: Configure when snapshots are created based on time intervals or stream batch counts
  • Automatic Compaction: Reduce storage overhead by compacting older snapshots (DuckDB only)
  • Bootstrap Integration: Snapshots can reset cache expiry on load for seamless recovery (DuckDB with Caching refresh mode)
  • Smart Creation Policies: Only create snapshots when data has actually changed

Example configuration:

datasets:
  - from: s3://my-bucket/data.parquet
    name: my_dataset
    acceleration:
      enabled: true
      engine: cayenne
      mode: file
      snapshots: enabled
      snapshots_trigger: time_interval
      snapshots_trigger_threshold: 1h
      snapshots_creation_policy: on_changed
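The `on_changed` creation policy can be pictured as a content-fingerprint check: take a snapshot only when the data's fingerprint differs from the last one. The hashing scheme below is a hypothetical stand-in for whatever change detection the engine actually uses:

```python
import hashlib

def fingerprint(rows):
    """Hash the dataset's rows into a compact change-detection token."""
    h = hashlib.sha256()
    for row in rows:
        h.update(repr(row).encode())
    return h.hexdigest()

def should_snapshot(rows, last_fingerprint):
    """Create a snapshot only when the data actually changed (on_changed policy)."""
    current = fingerprint(rows)
    return current != last_fingerprint, current

rows = [(1, "a"), (2, "b")]
take, fp = should_snapshot(rows, last_fingerprint=None)       # no prior snapshot: take one
take_again, _ = should_snapshot(rows, last_fingerprint=fp)    # unchanged data: skip
```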

Snapshots API and CLI: New API endpoints and CLI commands for managing snapshots programmatically.

CLI Commands:

# List all snapshots for a dataset
spice acceleration snapshots taxi_trips

# Get details of a specific snapshot
spice acceleration snapshot taxi_trips 3

# Set the current snapshot for rollback (requires runtime restart)
spice acceleration set-snapshot taxi_trips 2

HTTP API Endpoints:

| Method | Endpoint | Description |
| --- | --- | --- |
| GET | /v1/datasets/{dataset}/acceleration/snapshots | List all snapshots for a dataset |
| GET | /v1/datasets/{dataset}/acceleration/snapshots/{id} | Get details of a specific snapshot |
| POST | /v1/datasets/{dataset}/acceleration/snapshots/current | Set the current snapshot for rollback |
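A minimal sketch of calling these endpoints from Python using only the standard library, assuming the runtime's default HTTP port 8090; the JSON payload for the rollback POST is an assumption about the body shape, not a documented contract:

```python
import json
import urllib.request

BASE = "http://localhost:8090"  # default Spice HTTP API port
dataset = "taxi_trips"

# List all snapshots for a dataset
list_req = urllib.request.Request(f"{BASE}/v1/datasets/{dataset}/acceleration/snapshots")

# Get details of a specific snapshot
get_req = urllib.request.Request(f"{BASE}/v1/datasets/{dataset}/acceleration/snapshots/3")

# Set the current snapshot for rollback (payload shape is an assumption)
set_req = urllib.request.Request(
    f"{BASE}/v1/datasets/{dataset}/acceleration/snapshots/current",
    data=json.dumps({"id": 2}).encode(),
    headers={"Content-Type": "application/json"},
    method="POST",
)

# Against a running runtime, each request would be sent with e.g.:
# with urllib.request.urlopen(list_req) as resp: print(json.load(resp))
```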

For more details, refer to the Acceleration Snapshots Documentation.

Caching Acceleration Mode Improvements

The Caching Acceleration Mode introduced in v1.10.0 has received significant performance optimizations and reliability fixes in this release.

Performance Optimizations:

  • Non-blocking Cache Writes: Cache misses no longer block query responses. Data is written to the cache asynchronously after the query returns, reducing query latency for cache miss scenarios.
  • Batch Cache Writes: Multiple cache entries are now written in batches rather than individually, significantly improving write throughput for high-volume cache operations.

Reliability Fixes:

  • Correct SWR Refresh Behavior: The stale-while-revalidate (SWR) pattern now correctly refreshes only the specific entries that were accessed instead of refreshing all stale rows in the dataset. This prevents unnecessary source queries and reduces load on upstream data sources.
  • Deduplicated Refresh Requests: Fixed an issue where JSON array responses could trigger multiple redundant refresh operations. Refresh requests are now properly deduplicated.
  • Fixed Cache Hit Detection: Resolved an issue where queries that didn't include fetched_at in their projection would always result in cache misses, even when cached data was available.
  • Unfiltered Query Optimization: SELECT * queries without filters now return cached data directly without unnecessary filtering overhead.
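The corrected SWR behavior — answer from cache immediately and re-fetch only the keys that were actually read — can be sketched with a toy model (not the runtime's implementation):

```python
import time

class SwrCache:
    """Toy stale-while-revalidate cache: stale reads return cached data
    immediately and enqueue a refresh for that key only."""

    def __init__(self, ttl, fetch):
        self.ttl = ttl
        self.fetch = fetch            # source query, e.g. the upstream connector
        self.entries = {}             # key -> (value, fetched_at)
        self.pending_refresh = set()  # set() deduplicates refresh requests

    def get(self, key, now=None):
        now = time.monotonic() if now is None else now
        if key not in self.entries:
            value = self.fetch(key)            # cache miss: fetch synchronously
            self.entries[key] = (value, now)
            return value
        value, fetched_at = self.entries[key]
        if now - fetched_at > self.ttl:
            self.pending_refresh.add(key)      # stale: refresh this key, not all rows
        return value                           # always answer from cache

fetches = []
cache = SwrCache(ttl=60, fetch=lambda k: fetches.append(k) or f"row-{k}")
cache.get("a", now=0.0)      # miss -> fetch from source
cache.get("a", now=10.0)     # fresh hit -> no refresh queued
cache.get("a", now=100.0)    # stale hit -> served stale, queued for refresh
```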

For more details, refer to the Caching Acceleration Mode Documentation.

Prepared Statements

Improved Query Performance and Security: Spice now supports prepared statements, enabling parameterized queries that improve both performance through query plan caching and security by preventing SQL injection attacks.

Key Features:

  • Query Plan Caching: Prepared statements cache query plans, reducing planning overhead for repeated queries
  • SQL Injection Prevention: Parameters are safely bound, preventing SQL injection vulnerabilities
  • Arrow Flight SQL Support: Full prepared statement support via Arrow Flight SQL protocol
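The injection-prevention point can be demonstrated with any driver that supports bound parameters; here `sqlite3` from the Python standard library stands in for Spice's Flight SQL path. The attacker-controlled string stays a literal value instead of becoming SQL:

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE products (name TEXT, category TEXT)")
conn.execute("INSERT INTO products VALUES ('usb cable', 'electronics')")

# Attacker-controlled input that would break a string-concatenated query
malicious = "electronics' OR '1'='1"

# Bound parameter: the whole string is compared as a single literal value
rows = conn.execute(
    "SELECT name FROM products WHERE category = ?", (malicious,)
).fetchall()
print(rows)  # [] -- no row has that literal category
```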

SDK Support:

| SDK | Support | Min Version | Method |
| --- | --- | --- | --- |
| gospice (Go) | ✅ Full | v8.0.0+ | SqlWithParams() with typed constructors (Int32Param, StringParam, TimestampParam, etc.) |
| spice-rs (Rust) | ✅ Full | v3.0.0+ | query_with_params() with RecordBatch parameters |
| spice-dotnet (.NET) | ✅ Full | v0.3.0+ | QueryWithParams() with typed parameter builders |
| spice-java (Java) | ✅ Full | v0.5.0+ | queryWithParams() with typed Param constructors (Param.int64(), Param.string(), etc.) |
| spice.js (JavaScript) | ✅ Full | v3.1.0+ | query() with parameterized query support |
| spicepy (Python) | ✅ Full | v3.1.0+ | query() with parameterized query support |

Example (Go):

import "github.com/spiceai/gospice/v8"

client, _ := spice.NewClient()
defer client.Close()

// Parameterized query with typed parameters
results, _ := client.SqlWithParams(ctx,
    "SELECT * FROM products WHERE price > $1 AND category = $2",
    spice.Float64Param(10.0),
    spice.StringParam("electronics"),
)

Example (Java):

import ai.spice.SpiceClient;
import ai.spice.Param;
import org.apache.arrow.adbc.core.ArrowReader;

try (SpiceClient client = new SpiceClient()) {
    // With automatic type inference
    ArrowReader inferred = client.queryWithParams(
        "SELECT * FROM products WHERE price > $1 AND category = $2",
        10.0, "electronics");

    // With explicit typed parameters
    ArrowReader typed = client.queryWithParams(
        "SELECT * FROM products WHERE price > $1 AND category = $2",
        Param.float64(10.0),
        Param.string("electronics"));
}

For more details, refer to the Parameterized Queries Documentation.

Spice Java SDK v0.5.0

Parameterized Query Support for Java: The Spice Java SDK v0.5.0 introduces parameterized queries using ADBC (Arrow Database Connectivity), providing a safer and more efficient way to execute queries with dynamic parameters.

Key Features:

  • SQL Injection Prevention: Parameters are safely bound, preventing SQL injection vulnerabilities
  • Automatic Type Inference: Java types are automatically mapped to Arrow types (e.g., double → Float64, String → Utf8)
  • Explicit Type Control: Use the new Param class with typed factory methods (Param.int64(), Param.string(), Param.decimal128(), etc.) for precise control over Arrow types
  • Updated Dependencies: Apache Arrow Flight SQL upgraded to 18.3.0, plus new ADBC driver support

Example:

import ai.spice.SpiceClient;
import ai.spice.Param;
import org.apache.arrow.adbc.core.ArrowReader;

import java.math.BigDecimal;

try (SpiceClient client = new SpiceClient()) {
    // With automatic type inference
    ArrowReader inferred = client.queryWithParams(
        "SELECT * FROM taxi_trips WHERE trip_distance > $1 LIMIT 10",
        5.0);

    // With explicit typed parameters for precise control
    ArrowReader typed = client.queryWithParams(
        "SELECT * FROM orders WHERE order_id = $1 AND amount >= $2",
        Param.int64(12345),
        Param.decimal128(new BigDecimal("99.99"), 10, 2));
}

Maven:

<dependency>
    <groupId>ai.spice</groupId>
    <artifactId>spiceai</artifactId>
    <version>0.5.0</version>
</dependency>

For more details, refer to the Spice Java SDK Repository.

Google LLM Support

Expanded AI Provider Support: Spice now supports Google embedding and chat models via the Google AI provider, expanding the available LLM options for AI inference workloads alongside existing providers like OpenAI, Anthropic, and AWS Bedrock.

Key Features:

  • Google Chat Models: Access Google's Gemini models for chat completions
  • Google Embeddings: Generate embeddings using Google's text embedding models
  • Unified API: Use the same OpenAI-compatible API endpoints for all LLM providers

Example spicepod.yaml configuration:

models:
  - from: google:gemini-2.0-flash
    name: gemini
    params:
      google_api_key: ${secrets:GOOGLE_API_KEY}

embeddings:
  - from: google:text-embedding-004
    name: google_embeddings
    params:
      google_api_key: ${secrets:GOOGLE_API_KEY}

For more details, refer to the Google LLM Documentation (see docs PR #1286).

URL Tables

Query data sources directly via URL in SQL without prior dataset registration. Supports S3, Azure Blob Storage, and HTTP/HTTPS URLs with automatic format detection and partition inference.

Supported Patterns:

  • Single files: SELECT * FROM 's3://bucket/data.parquet'
  • Directories/prefixes: SELECT * FROM 's3://bucket/data/'
  • Glob patterns: SELECT * FROM 's3://bucket/year=*/month=*/data.parquet'

Key Features:

  • Automatic file format detection (Parquet, CSV, JSON, etc.)
  • Hive-style partition inference with filter pushdown
  • Schema inference from files
  • Works with both SQL and DataFrame APIs

Example with hive partitioning:

-- Partitions are automatically inferred from paths
SELECT * FROM 's3://bucket/data/' WHERE year = '2024' AND month = '01'
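Hive-style partition inference with filter pushdown boils down to parsing `key=value` path segments and pruning files before any data is read. A sketch of the idea (an illustration, not Spice's scanner):

```python
def partition_values(path):
    """Extract hive-style key=value segments from an object-store path."""
    parts = {}
    for segment in path.split("/"):
        if "=" in segment:
            key, _, value = segment.partition("=")
            parts[key] = value
    return parts

def prune(paths, filters):
    """Keep only files whose partition values satisfy the equality filters,
    so non-matching partitions are never read (filter pushdown)."""
    return [p for p in paths
            if all(partition_values(p).get(k) == v for k, v in filters.items())]

paths = [
    "s3://bucket/data/year=2024/month=01/part-0.parquet",
    "s3://bucket/data/year=2024/month=02/part-0.parquet",
    "s3://bucket/data/year=2023/month=01/part-0.parquet",
]
print(prune(paths, {"year": "2024", "month": "01"}))  # only the first file survives
```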

Enable via spicepod.yml:

runtime:
  params:
    url_tables: enabled

Cluster Mode Async Query APIs (experimental)

New asynchronous query APIs for long-running queries in cluster mode:

  • /v1/queries endpoint: Submit queries and retrieve results asynchronously

OpenTelemetry Improvements

Unified Telemetry Endpoint: OTel metrics ingestion has been consolidated to the Flight port (50051), simplifying deployment by removing the separate OTel port (50052). The push-based metrics exporter continues to support integration with OpenTelemetry collectors.

Note: This is a breaking change. Update your configurations if you were using the dedicated OTel port 50052. Internal cluster communication now uses port 50052 exclusively.

Observability Improvements

Enhanced Dashboards: Updated Grafana and Datadog example dashboards with:

  • Snapshot monitoring widgets
  • Improved accelerated datasets section
  • Renamed ingestion lag charts for clarity

Additional Histogram Buckets: Added more buckets to histogram metrics for better latency distribution visibility.

For more details, refer to the Monitoring Documentation.

Hash Indexing for Arrow Acceleration (experimental)

Arrow-based accelerations now support hash indexing for faster point lookups on equality predicates. Hash indexes provide O(1) average-case lookup performance for columns with high cardinality.

Features:

  • Primary key hash index support
  • Secondary index support for non-primary key columns
  • Composite key support with proper null value handling

Example configuration:

datasets:
  - from: postgres:users
    name: users
    acceleration:
      enabled: true
      engine: arrow
      primary_key: user_id
      indexes:
        '(tenant_id, user_id)': unique # Composite hash index
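A composite hash index like the one configured above can be sketched as a dict keyed on the tuple of column values. Skipping rows with NULLs in the key is one common way to handle null values; treating it that way here is an assumption, not necessarily the accelerator's exact semantics:

```python
def build_hash_index(rows, key_columns):
    """Map composite key tuples to row positions for O(1) point lookups."""
    index = {}
    for pos, row in enumerate(rows):
        key = tuple(row[c] for c in key_columns)
        if any(v is None for v in key):
            continue  # rows with NULL key parts are not indexed (assumption)
        index.setdefault(key, []).append(pos)
    return index

rows = [
    {"tenant_id": 1, "user_id": 10, "name": "ada"},
    {"tenant_id": 1, "user_id": 11, "name": "bob"},
    {"tenant_id": None, "user_id": 12, "name": "eve"},
]
idx = build_hash_index(rows, ("tenant_id", "user_id"))
print(idx.get((1, 11)))  # O(1) equality lookup -> [1]
```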

For more details, refer to the Hash Index Documentation.

SMB and NFS Data Connectors

Network-Attached Storage Connectors: New data connectors for SMB (Server Message Block) and NFS (Network File System) protocols enable direct federated queries against network-attached storage without requiring data movement to cloud object stores.

Key Features:

  • SMB Protocol Support: Connect to Windows file shares and Samba servers with authentication support
  • NFS Protocol Support: Connect to Unix/Linux NFS exports for direct data access
  • Federated Queries: Query Parquet, CSV, JSON, and other file formats directly from network storage with full SQL support
  • Acceleration Support: Accelerate data from SMB/NFS sources using DuckDB, Spice Cayenne, or other accelerators

Example spicepod.yaml configuration:

datasets:
  # SMB share
  - from: smb://fileserver/share/data.parquet
    name: smb_data
    params:
      smb_username: ${secrets:SMB_USER}
      smb_password: ${secrets:SMB_PASS}

  # NFS export
  - from: nfs://nfsserver/export/data.parquet
    name: nfs_data

For more details, refer to the Data Connectors Documentation.

ScyllaDB Data Connector

A new data connector for ScyllaDB, the high-performance NoSQL database compatible with Apache Cassandra. Query ScyllaDB tables directly or accelerate them for faster analytics.

Example configuration:

datasets:
  - from: scylladb:my_keyspace.my_table
    name: scylla_data
    acceleration:
      enabled: true
      engine: duckdb

For more details, refer to the ScyllaDB Data Connector Documentation.

Flight SQL TLS Connection Fixes

TLS Connection Support: Fixed TLS connection issues when using grpc+tls:// scheme with Flight SQL endpoints. Added support for custom CA certificate files via the new flightsql_tls_ca_certificate_file parameter.

Developer Experience Improvements

  • Turso v0.3.2 Upgrade: Upgraded Turso accelerator for improved performance and reliability
  • Rust 1.91 Upgrade: Updated to Rust 1.91 for latest language features and performance improvements
  • Spice Cloud CLI: Added spice cloud CLI commands for cloud deployment management
  • Improved Spicepod Schema: Improved JSON schema generation for better IDE support and validation
  • Acceleration Snapshots: Added configurable snapshots_create_interval for periodic acceleration snapshots independent of refresh cycles
  • Tiered Caching with Localpod: The Localpod connector now supports caching refresh mode, enabling multi-layer acceleration where a persistent cache feeds a fast in-memory cache
  • GitHub Data Connector: Added workflows and workflow runs support for GitHub repositories
  • NDJSON/LDJSON Support: Added support for Newline Delimited JSON and Line Delimited JSON file formats

Additional Improvements & Bug Fixes

  • Model Listing: New functionality to list available models across multiple AI providers
  • DuckDB Partitioned Tables: Primary key constraints now supported in partitioned DuckDB table mode
  • Post-refresh Sorting: New on_refresh_sort_columns parameter for DuckDB enables data ordering after writes
  • Improved Install Scripts: Removed jq dependency and improved cross-platform compatibility
  • Better Error Messages: Improved error messaging for bucket UDF arguments and deprecated OpenAI parameters
  • Reliability: Fixed DynamoDB IAM role authentication with new dynamodb_auth: iam_role parameter
  • Reliability: Fixed cluster executors to use scheduler's temp_directory parameter for shuffle files
  • Reliability: Initialize secrets before object stores in cluster executor mode
  • Reliability: Added page-level retry with backoff for transient GitHub GraphQL errors
  • Performance: Improved statistics for rewritten DistributeFileScanOptimizer plans
  • Developer Experience: Added max_message_size configuration for Flight service

Contributors

Breaking Changes

OTel Ingestion Port Change

OTel ingestion has been moved to the Flight port (50051), removing the separate OTel port 50052. Port 50052 is now used exclusively for internal cluster communication. Update your configurations if you were using the dedicated OTel port.

Distributed Query Cluster Mode Requires mTLS

Distributed query cluster mode now requires mTLS for secure communication between cluster nodes. This is a security enhancement to prevent unauthorized nodes from joining the cluster and accessing secrets.

Migration Steps:

  1. Generate certificates using spice cluster tls init and spice cluster tls add
  2. Update scheduler and executor startup commands with --node-mtls-* arguments
  3. For development/testing, use --allow-insecure-connections to opt out of mTLS

Renamed CLI Arguments:

| Old Name | New Name |
| --- | --- |
| --cluster-mode | --role |
| --cluster-ca-certificate-file | --node-mtls-ca-certificate-file |
| --cluster-certificate-file | --node-mtls-certificate-file |
| --cluster-key-file | --node-mtls-key-file |
| --cluster-address | --node-bind-address |
| --cluster-advertise-address | --node-advertise-address |
| --cluster-scheduler-url | --scheduler-address |

Removed CLI Arguments:

  • --cluster-api-key: Replaced by mTLS authentication

Cookbook Updates

New ScyllaDB Data Connector Recipe: New recipe demonstrating how to use the ScyllaDB Data Connector. See ScyllaDB Data Connector Recipe for details.

New SMB Data Connector Recipe: New recipe demonstrating how to use the SMB Data Connector. See SMB Data Connector Recipe for details.

The Spice Cookbook includes 86 recipes to help you get started with Spice quickly and easily.

Upgrading

To upgrade to v1.11.0, use one of the following methods:

CLI:

spice upgrade

Homebrew:

brew upgrade spiceai/spiceai/spice

Docker:

Pull the spiceai/spiceai:1.11.0 image:

docker pull spiceai/spiceai:1.11.0

For available tags, see DockerHub.

Helm:

helm repo update
helm upgrade spiceai spiceai/spiceai --version 1.11.0

AWS Marketplace:

Spice is available in the AWS Marketplace.

Dependencies

What's Changed

Changelog

Spice v1.11.0-rc.2 (Jan 22, 2026)

· 24 min read
Viktor Yershov
Senior Software Engineer at Spice AI

Announcing the release of Spice v1.11.0-rc.2! ⭐

v1.11.0-rc.2 is the second release candidate for advanced testing of v1.11. It brings Spice Cayenne to Beta status with acceleration snapshots support, a new ScyllaDB Data Connector, and upgrades to DataFusion v51, Arrow 57.2, and iceberg-rust v0.8.0. It also includes significant improvements to distributed query, caching, and observability.

What's New in v1.11.0-rc.2

Spice Cayenne Accelerator Reaches Beta

Spice Cayenne has been promoted to Beta status with acceleration snapshots support and numerous stability improvements.

Improved Reliability:

  • Fixed timezone database issues in Docker images that caused acceleration panics
  • Resolved FuturesUnordered reentrant drop crashes
  • Fixed memory growth issues related to Vortex metrics allocation
  • Metadata catalog now properly respects cayenne_file_path location
  • Added warnings for unparseable configuration values

Example configuration with snapshots:

datasets:
  - from: s3://my-bucket/data.parquet
    name: my_dataset
    acceleration:
      enabled: true
      engine: cayenne
      mode: file

DataFusion v51 Upgrade

Apache DataFusion has been upgraded to v51, bringing significant performance improvements, new SQL features, and enhanced observability.

DataFusion v51 ClickBench Performance

Performance Improvements:

  • Faster CASE Expression Evaluation: Expressions now short-circuit earlier, reuse partial results, and avoid unnecessary scattering, speeding up common ETL patterns
  • Better Defaults for Remote Parquet Reads: DataFusion now fetches the last 512KB of Parquet files by default, typically avoiding 2 I/O requests per file
  • Faster Parquet Metadata Parsing: Leverages Arrow 57's new thrift metadata parser for up to 4x faster metadata parsing

New SQL Features:

  • SQL Pipe Operators: Support for |> syntax for inline transforms
  • DESCRIBE <query>: Returns the schema of any query without executing it
  • Named Arguments in SQL Functions: PostgreSQL-style param => value syntax for scalar, aggregate, and window functions
  • Decimal32/Decimal64 Support: New Arrow types supported including aggregations like SUM, AVG, and MIN/MAX

Example pipe operator:

SELECT * FROM t
|> WHERE a > 10
|> ORDER BY b
|> LIMIT 5;

Improved Observability:

  • Improved EXPLAIN ANALYZE Metrics: New metrics including output_bytes, selectivity for filters, reduction_factor for aggregates, and detailed timing breakdowns

Arrow 57.2 Upgrade

Spice has been upgraded to Apache Arrow Rust 57.2.0, bringing major performance improvements and new capabilities.

Arrow 57 Parquet Metadata Parsing Performance

Key Features:

  • 4x Faster Parquet Metadata Parsing: A rewritten thrift metadata parser delivers up to 4x faster metadata parsing, especially beneficial for low-latency use cases and files with large amounts of metadata
  • Parquet Variant Support: Experimental support for reading and writing the new Parquet Variant type for semi-structured data, including shredded variant values
  • Parquet Geometry Support: Read and write support for Parquet Geometry types (GEOMETRY and GEOGRAPHY) with GeospatialStatistics
  • New arrow-avro Crate: Efficient conversion between Apache Avro and Arrow RecordBatches with projection pushdown and vectorized execution support

iceberg-rust v0.8.0 Upgrade

Spice has been upgraded to iceberg-rust v0.8.0, bringing improved Iceberg table support.

Key Features:

  • V3 Metadata Support: Full support for Iceberg V3 table metadata format
  • INSERT INTO Partitioned Tables: DataFusion integration now supports inserting data into partitioned Iceberg tables
  • Improved Delete File Handling: Better support for position and equality delete files, including shared delete file loading and caching
  • SQL Catalog Updates: Implement update_table and register_table for SQL catalog
  • S3 Tables Catalog: Implement update_table for S3 Tables catalog
  • Enhanced Arrow Integration: Convert Arrow schema to Iceberg schema with auto-assigned field IDs, _file column support, and Date32 type support

Acceleration Snapshots

Acceleration snapshots enable point-in-time recovery and data versioning for accelerated datasets. Snapshots capture the state of accelerated data at specific points, allowing for fast bootstrap recovery and rollback capabilities.

Key Feature Improvements in v1.11:

  • Flexible Triggers: Configure when snapshots are created based on time intervals or stream batch counts
  • Automatic Compaction: Reduce storage overhead by compacting older snapshots (DuckDB only)
  • Bootstrap Integration: Snapshots can reset cache expiry on load for seamless recovery (DuckDB with Caching refresh mode)
  • Smart Creation Policies: Only create snapshots when data has actually changed

Example configuration:

datasets:
  - from: s3://my-bucket/data.parquet
    name: my_dataset
    acceleration:
      enabled: true
      engine: cayenne
      mode: file
      snapshots: enabled
      snapshots_trigger: time_interval
      snapshots_trigger_threshold: 1h
      snapshots_creation_policy: on_changed

Snapshots API and CLI: New API endpoints and CLI commands for managing snapshots programmatically. List, create, and restore snapshots directly from the command line or via HTTP.

For more details, refer to the Acceleration Snapshots Documentation.

ScyllaDB Data Connector

A new data connector for ScyllaDB, the high-performance NoSQL database compatible with Apache Cassandra. Query ScyllaDB tables directly or accelerate them for faster analytics.

Example configuration:

datasets:
  - from: scylladb:my_keyspace.my_table
    name: scylla_data
    acceleration:
      enabled: true
      engine: duckdb

For more details, refer to the ScyllaDB Data Connector Documentation.

Distributed Query Improvements

mTLS Verification: Cluster communication between scheduler and executors now supports mutual TLS verification for enhanced security.

Credential Propagation: Azure and GCS credentials are now automatically propagated to executors in cluster mode, enabling access to cloud storage across the distributed query cluster.

Improved Resilience:

  • Exponential backoff for scheduler disconnection recovery
  • Increased gRPC message size limit from 16MB to 100MB for large query plans
  • HTTP health endpoint for cluster executors
  • Automatic executor role inference when --scheduler-address is provided

For more details, refer to the Distributed Query Documentation.

Caching Acceleration Mode Improvements

The Caching Acceleration Mode introduced in v1.10.0 has received significant performance optimizations and reliability fixes in this release.

Performance Optimizations:

  • Non-blocking Cache Writes: Cache misses no longer block query responses. Data is written to the cache asynchronously after the query returns, reducing query latency for cache miss scenarios.
  • Batch Cache Writes: Multiple cache entries are now written in batches rather than individually, significantly improving write throughput for high-volume cache operations.

Reliability Fixes:

  • Correct SWR Refresh Behavior: The stale-while-revalidate (SWR) pattern now correctly refreshes only the specific entries that were accessed instead of refreshing all stale rows in the dataset. This prevents unnecessary source queries and reduces load on upstream data sources.
  • Deduplicated Refresh Requests: Fixed an issue where JSON array responses could trigger multiple redundant refresh operations. Refresh requests are now properly deduplicated.
  • Fixed Cache Hit Detection: Resolved an issue where queries that didn't include fetched_at in their projection would always result in cache misses, even when cached data was available.
  • Unfiltered Query Optimization: SELECT * queries without filters now return cached data directly without unnecessary filtering overhead.

For more details, refer to the Caching Acceleration Mode Documentation.

DynamoDB Connector Enhancements

  • Added JSON nesting for DynamoDB Streams
  • Proper batch deletion handling

URL Tables

Query data sources directly via URL in SQL without prior dataset registration. Supports S3, Azure Blob Storage, and HTTP/HTTPS URLs with automatic format detection and partition inference.

Supported Patterns:

  • Single files: SELECT * FROM 's3://bucket/data.parquet'
  • Directories/prefixes: SELECT * FROM 's3://bucket/data/'
  • Glob patterns: SELECT * FROM 's3://bucket/year=*/month=*/data.parquet'

Key Features:

  • Automatic file format detection (Parquet, CSV, JSON, etc.)
  • Hive-style partition inference with filter pushdown
  • Schema inference from files
  • Works with both SQL and DataFrame APIs

Example with hive partitioning:

-- Partitions are automatically inferred from paths
SELECT * FROM 's3://bucket/data/' WHERE year = '2024' AND month = '01'

Enable via spicepod.yml:

runtime:
  params:
    url_tables: enabled

Cluster Mode Async Query APIs (experimental)

New asynchronous query APIs for long-running queries in cluster mode:

  • /v1/queries endpoint: Submit queries and retrieve results asynchronously
  • Arrow Flight async support: Non-blocking query execution via Arrow Flight protocol

Observability Improvements

Enhanced Dashboards: Updated Grafana and Datadog example dashboards with:

  • Snapshot monitoring widgets
  • Improved accelerated datasets section
  • Renamed ingestion lag charts for clarity

Additional Histogram Buckets: Added more buckets to histogram metrics for better latency distribution visibility.

For more details, refer to the Monitoring Documentation.

Additional Improvements

  • Model Listing: New functionality to list available models across multiple AI providers
  • DuckDB Partitioned Tables: Primary key constraints now supported in partitioned DuckDB table mode
  • Post-refresh Sorting: New on_refresh_sort_columns parameter for DuckDB enables data ordering after writes
  • Improved Install Scripts: Removed jq dependency and improved cross-platform compatibility
  • Better Error Messages: Improved error messaging for bucket UDF arguments and deprecated OpenAI parameters

Contributors

Breaking Changes

No breaking changes.

Cookbook Updates

New ScyllaDB Data Connector Recipe: New recipe demonstrating how to use the ScyllaDB Data Connector. See ScyllaDB Data Connector Recipe for details.

New SMB Data Connector Recipe: New recipe demonstrating how to use the SMB Data Connector. See SMB Data Connector Recipe for details.

The Spice Cookbook includes 86 recipes to help you get started with Spice quickly and easily.

Upgrading

To upgrade to v1.11.0-rc.2, use one of the following methods:

CLI:

spice upgrade

Homebrew:

brew upgrade spiceai/spiceai/spice

Docker:

Pull the spiceai/spiceai:v1.11.0-rc.2 image:

docker pull spiceai/spiceai:v1.11.0-rc.2

For available tags, see DockerHub.

Helm:

helm repo update
helm upgrade spiceai spiceai/spiceai

AWS Marketplace:

Spice is available in the AWS Marketplace.

Dependencies

Changelog

Spice v1.11.0-rc.1 (Jan 6, 2026)

· 17 min read
Evgenii Khramkov
Senior Software Engineer at Spice AI

Announcing the release of Spice v1.11.0-rc.1! ⭐

v1.11.0-rc.1 is the first release candidate for early testing of v1.11 features including Distributed Query with mTLS for enterprise-grade secure cluster communication, new SMB and NFS Data Connectors for direct network-attached storage access, Prepared Statements for improved query performance and security, Cayenne Accelerator Enhancements with Key-based deletion vectors and Amazon S3 Express One Zone support, Google LLM Support for expanded AI inference capabilities, and Spice Java SDK v0.5.0 with parameterized query support.

What's New in v1.11.0-rc.1

Distributed Query with mTLS

Enterprise-Grade Secure Cluster Communication: Distributed query cluster mode now enables mutual TLS (mTLS) by default for secure communication between schedulers and executors. Internal cluster communication includes highly privileged RPC calls like fetching Spicepod configuration and expanding secrets. mTLS ensures only authenticated nodes can join the cluster and access sensitive data.

Key Features:

  • Mutual TLS Authentication: All executor-to-scheduler and executor-to-executor gRPC connections on the internal cluster port (50052) are secured with mTLS, preventing unauthorized nodes from joining the cluster
  • Certificate Management CLI: New spice cluster tls init and spice cluster tls add commands for generating CA certificates and node certificates with proper SANs (Subject Alternative Names)
  • Simplified CLI Arguments: Renamed cluster arguments for clarity (--role, --scheduler-address, --node-mtls-*) with --scheduler-address implying --role executor
  • Port Separation: Public services (Flight queries, HTTP API, Prometheus metrics) remain on ports 50051, 8090, and 9090 respectively, while internal cluster services (SchedulerGrpcServer, ClusterService) are isolated on port 50052 with mTLS enforced
  • Development Mode: Use --allow-insecure-connections flag to disable mTLS requirement for local development and testing

Quick Start:

# Generate certificates for development
spice cluster tls init
spice cluster tls add scheduler1
spice cluster tls add executor1

# Start scheduler
spiced --role scheduler \
  --node-mtls-ca-certificate-file ca.crt \
  --node-mtls-certificate-file scheduler1.crt \
  --node-mtls-key-file scheduler1.key

# Start executor
spiced --role executor \
  --scheduler-address https://scheduler1:50052 \
  --node-mtls-ca-certificate-file ca.crt \
  --node-mtls-certificate-file executor1.crt \
  --node-mtls-key-file executor1.key

For more details, refer to the Distributed Query Documentation.

SMB and NFS Data Connectors

Network-Attached Storage Connectors: New data connectors for SMB (Server Message Block) and NFS (Network File System) protocols enable direct federated queries against network-attached storage without requiring data movement to cloud object stores.

Key Features:

  • SMB Protocol Support: Connect to Windows file shares and Samba servers with authentication support
  • NFS Protocol Support: Connect to Unix/Linux NFS exports for direct data access
  • Federated Queries: Query Parquet, CSV, JSON, and other file formats directly from network storage with full SQL support
  • Acceleration Support: Accelerate data from SMB/NFS sources using DuckDB, Spice Cayenne, or other accelerators

Example spicepod.yaml configuration:

datasets:
  # SMB share
  - from: smb://fileserver/share/data.parquet
    name: smb_data
    params:
      smb_username: ${secrets:SMB_USER}
      smb_password: ${secrets:SMB_PASS}

  # NFS export
  - from: nfs://nfsserver/export/data.parquet
    name: nfs_data

For more details, refer to the Data Connectors Documentation.

Prepared Statements

Improved Query Performance and Security: Spice now supports prepared statements, enabling parameterized queries that improve both performance through query plan caching and security by preventing SQL injection attacks.

Key Features:

  • Query Plan Caching: Prepared statements cache query plans, reducing planning overhead for repeated queries
  • SQL Injection Prevention: Parameters are safely bound, preventing SQL injection vulnerabilities
  • Arrow Flight SQL Support: Full prepared statement support via Arrow Flight SQL protocol

SDK Support:

| SDK | Support | Min Version | Method |
|---|---|---|---|
| gospice (Go) | ✅ Full | v8.0.0+ | SqlWithParams() with typed constructors (Int32Param, StringParam, TimestampParam, etc.) |
| spice-rs (Rust) | ✅ Full | v3.0.0+ | query_with_params() with RecordBatch parameters |
| spice-dotnet (.NET) | ❌ Not yet | - | Coming soon |
| spice-java (Java) | ✅ Full | v0.5.0+ | queryWithParams() with typed Param constructors (Param.int64(), Param.string(), etc.) |
| spice.js (JavaScript) | ❌ Not yet | - | Coming soon |
| spicepy (Python) | ❌ Not yet | - | Coming soon |

Example (Go):

// The gospice package is aliased to spice so the calls below read as spice.*
import spice "github.com/spiceai/gospice/v8"

client, _ := spice.NewClient()
defer client.Close()

// Parameterized query with typed parameters
results, _ := client.SqlWithParams(ctx,
    "SELECT * FROM products WHERE price > $1 AND category = $2",
    spice.Float64Param(10.0),
    spice.StringParam("electronics"),
)

Example (Java):

import ai.spice.SpiceClient;
import ai.spice.Param;
import org.apache.arrow.adbc.core.ArrowReader;

try (SpiceClient client = new SpiceClient()) {
    // With automatic type inference
    ArrowReader reader = client.queryWithParams(
        "SELECT * FROM products WHERE price > $1 AND category = $2",
        10.0, "electronics");

    // With explicit typed parameters (reusing the reader variable)
    reader = client.queryWithParams(
        "SELECT * FROM products WHERE price > $1 AND category = $2",
        Param.float64(10.0),
        Param.string("electronics"));
}

For more details, refer to the Parameterized Queries Documentation.

Spice Cayenne Accelerator Enhancements

The Spice Cayenne data accelerator has been improved with several key enhancements:

  • KeyBased Deletion Vectors: Improved deletion vector support using key-based lookups for more efficient data management and faster delete operations. KeyBased deletion vectors are more memory-efficient than positional vectors for sparse deletions.
  • S3 Express One Zone Support: Store Cayenne data files in S3 Express One Zone for single-digit millisecond latency, ideal for latency-sensitive query workloads that require persistence.

Example spicepod.yaml configuration:

datasets:
  - from: s3://my-bucket/data.parquet
    name: fast_data
    acceleration:
      enabled: true
      engine: cayenne
      mode: file
      params:
        # Use S3 Express One Zone for data files
        cayenne_s3express_bucket: my-express-bucket--usw2-az1--x-s3
For more details, refer to the Cayenne Documentation.

Google LLM Support

Expanded AI Provider Support: Spice now supports Google embedding and chat models via the Google AI provider, expanding the available LLM options for AI inference workloads alongside existing providers like OpenAI, Anthropic, and AWS Bedrock.

Key Features:

  • Google Chat Models: Access Google's Gemini models for chat completions
  • Google Embeddings: Generate embeddings using Google's text embedding models
  • Unified API: Use the same OpenAI-compatible API endpoints for all LLM providers

Example spicepod.yaml configuration:

models:
  - from: google:gemini-2.0-flash
    name: gemini
    params:
      google_api_key: ${secrets:GOOGLE_API_KEY}

embeddings:
  - from: google:text-embedding-004
    name: google_embeddings
    params:
      google_api_key: ${secrets:GOOGLE_API_KEY}

For more details, refer to the Google LLM Documentation (see docs PR #1286).

Spice Java SDK v0.5.0

Parameterized Query Support for Java: The Spice Java SDK v0.5.0 introduces parameterized queries using ADBC (Arrow Database Connectivity), providing a safer and more efficient way to execute queries with dynamic parameters.

Key Features:

  • SQL Injection Prevention: Parameters are safely bound, preventing SQL injection vulnerabilities
  • Automatic Type Inference: Java types are automatically mapped to Arrow types (e.g., double → Float64, String → Utf8)
  • Explicit Type Control: Use the new Param class with typed factory methods (Param.int64(), Param.string(), Param.decimal128(), etc.) for precise control over Arrow types
  • Updated Dependencies: Apache Arrow Flight SQL upgraded to 18.3.0, plus new ADBC driver support

Example:

import ai.spice.SpiceClient;
import ai.spice.Param;
import org.apache.arrow.adbc.core.ArrowReader;

import java.math.BigDecimal;

try (SpiceClient client = new SpiceClient()) {
    // With automatic type inference
    ArrowReader reader = client.queryWithParams(
        "SELECT * FROM taxi_trips WHERE trip_distance > $1 LIMIT 10",
        5.0);

    // With explicit typed parameters for precise control
    reader = client.queryWithParams(
        "SELECT * FROM orders WHERE order_id = $1 AND amount >= $2",
        Param.int64(12345),
        Param.decimal128(new BigDecimal("99.99"), 10, 2));
}

Maven:

<dependency>
  <groupId>ai.spice</groupId>
  <artifactId>spiceai</artifactId>
  <version>0.5.0</version>
</dependency>

For more details, refer to the Spice Java SDK Repository.

OpenTelemetry Improvements

Unified Telemetry Endpoint: OTel metrics ingestion has been consolidated to the Flight port (50051), simplifying deployment by removing the separate OTel port (50052). The push-based metrics exporter continues to support integration with OpenTelemetry collectors.

Note: This is a breaking change. Update your configurations if you were using the dedicated OTel port 50052. Internal cluster communication now uses port 50052 exclusively.
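For example, an OpenTelemetry Collector pipeline that previously exported to the dedicated OTel port would now point its OTLP exporter at the Flight port (the hostname below is a placeholder):

```yaml
exporters:
  otlp:
    # Spice now ingests OTel metrics on the Flight port (50051);
    # port 50052 is reserved for internal cluster communication.
    endpoint: spice-host:50051
    tls:
      insecure: true # omit if the Flight endpoint serves TLS
```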

Developer Experience Improvements

  • Turso v0.3.2 Upgrade: Upgraded Turso accelerator for improved performance and reliability
  • Rust 1.91 Upgrade: Updated to Rust 1.91 for latest language features and performance improvements
  • Spice Cloud CLI: Added spice cloud CLI commands for cloud deployment management
  • Improved Spicepod Schema: Enhanced JSON schema generation for better IDE support and validation
  • Acceleration Snapshots: Added configurable snapshots_create_interval for periodic acceleration snapshots independent of refresh cycles
  • Tiered Caching with Localpod: The Localpod connector now supports caching refresh mode, enabling multi-layer acceleration where a persistent cache feeds a fast in-memory cache
  • GitHub Data Connector: Added workflows and workflow runs support for GitHub repositories
  • NDJSON/LDJSON Support: Added support for Newline Delimited JSON and Line Delimited JSON file formats

Additional Improvements & Bug Fixes

  • Reliability: Fixed DynamoDB IAM role authentication with new dynamodb_auth: iam_role parameter
  • Reliability: Fixed cluster executors to use scheduler's temp_directory parameter for shuffle files
  • Reliability: Initialize secrets before object stores in cluster executor mode
  • Reliability: Added page-level retry with backoff for transient GitHub GraphQL errors
  • Performance: Improved statistics for rewritten DistributeFileScanOptimizer plans
  • Developer Experience: Added max_message_size configuration for Flight service

Contributors

Breaking Changes

OTel Ingestion Port Change

OTel ingestion has been moved to the Flight port (50051), removing the separate OTel port 50052. Port 50052 is now used exclusively for internal cluster communication. Update your configurations if you were using the dedicated OTel port.

Distributed Query Cluster Mode Requires mTLS

Distributed query cluster mode now requires mTLS for secure communication between cluster nodes. This is a security enhancement to prevent unauthorized nodes from joining the cluster and accessing secrets.

Migration Steps:

  1. Generate certificates using spice cluster tls init and spice cluster tls add
  2. Update scheduler and executor startup commands with --node-mtls-* arguments
  3. For development/testing, use --allow-insecure-connections to opt out of mTLS

Renamed CLI Arguments:

| Old Name | New Name |
|---|---|
| --cluster-mode | --role |
| --cluster-ca-certificate-file | --node-mtls-ca-certificate-file |
| --cluster-certificate-file | --node-mtls-certificate-file |
| --cluster-key-file | --node-mtls-key-file |
| --cluster-address | --node-bind-address |
| --cluster-advertise-address | --node-advertise-address |
| --cluster-scheduler-url | --scheduler-address |

Removed CLI Arguments:

  • --cluster-api-key: Replaced by mTLS authentication

Cookbook Updates

No major cookbook updates.

The Spice Cookbook includes 84 recipes to help you get started with Spice quickly and easily.

Upgrading

To try v1.11.0-rc.1, use one of the following methods:

CLI:

spice upgrade --version 1.11.0-rc.1

Homebrew:

brew upgrade spiceai/spiceai/spice

Docker:

Pull the spiceai/spiceai:1.11.0-rc.1 image:

docker pull spiceai/spiceai:1.11.0-rc.1

For available tags, see DockerHub.

Helm:

helm repo update
helm upgrade spiceai spiceai/spiceai --version 1.11.0-rc.1

AWS Marketplace:

🎉 Spice is available in the AWS Marketplace!

What's Changed

Changelog

Spice v1.10.1 (Dec 15, 2025)

· 5 min read
Jack Eadie
Token Plumber at Spice AI

Announcing the release of Spice v1.10.1! 🚀

v1.10.1 is a patch release with Cayenne accelerator improvements, including configurable compression strategies and improved partition ID handling, an isolated refresh runtime for better query API responsiveness, and security hardening. In addition, the Go SDK, gospice v8, has been released.

What's New in v1.10.1

Cayenne Accelerator Improvements

Several improvements and bug fixes for the Cayenne data accelerator:

  • Compression Strategies: The new cayenne_compression_strategy parameter enables choosing between zstd for compact storage or btrblocks for encoding-efficient compression.
  • Improved Vortex Defaults: Aligned Cayenne to Vortex footer configuration for better compatibility.
  • Partition ID Handling: Improved partition ID generation to avoid potential locking race conditions.

Example spicepod.yaml configuration:

datasets:
  - from: s3://my-bucket/data.parquet
    name: my_dataset
    acceleration:
      enabled: true
      engine: cayenne
      mode: file
      params:
        cayenne_compression_strategy: zstd # or btrblocks (default)

For more details, refer to the Cayenne Data Accelerator Documentation.

Isolated Refresh Runtime

Refresh tasks now run on a separate Tokio runtime isolated from the main query API. This prevents long-running or resource-intensive refresh operations from impacting query latency and ensures the /health endpoint remains responsive during heavy refresh workloads.

Security Hardening

Multiple security improvements have been implemented:

  • Recursion Depth Limits: Added limits to DynamoDB and S3 Vectors integrations to prevent stack overflow from deeply nested structures, mitigating potential DoS attacks.
  • Spicepod Summary API: The GET /v1/spicepods endpoint now returns summarized information instead of full spicepod.yaml representations, preventing potential sensitive information leakage.

Additional Improvements & Bug Fixes

  • Performance: Fixed double hashing of user supplied cache keys, improving cache lookup efficiency.
  • Reliability: Fixed idle DynamoDB Stream handling for more stable CDC operations.
  • Reliability: Added warnings when multiple partitions are defined for the same table.
  • Performance: Eagerly drop cached records for results larger than max cache size.

Spice Go SDK v8

The Spice Go SDK has been upgraded to v8 with a cleaner API, parameterized queries, and health check methods: gospice v8.0.0.

Key Features:

  • Cleaner API: New Sql() and SqlWithParams() methods with more intuitive naming.
  • Parameterized Queries: Safe, SQL-injection-resistant queries with automatic Go-to-Arrow type inference.
  • Typed Parameters: Explicit type control with constructors like Decimal128Param, TimestampParam, and more.
  • Health Check Methods: New IsSpiceHealthy() and IsSpiceReady() methods for instance monitoring.
  • Upgraded Dependencies: Apache Arrow v18 and ADBC Go driver v1.3.0.

Example usage with a local Spice runtime:

import "github.com/spiceai/gospice/v8"

// Initialize client for local runtime
spice := gospice.NewSpiceClient()
defer spice.Close()

if err := spice.Init(
    gospice.WithFlightAddress("grpc://localhost:50051"),
); err != nil {
    panic(err)
}

// Parameterized query (safe from SQL injection)
reader, err := spice.SqlWithParams(
    ctx,
    "SELECT * FROM users WHERE id = $1 AND created_at > $2",
    userId,
    startTime,
)

Upgrade:

go get github.com/spiceai/gospice/v8@v8.0.0

For more details, refer to the Go SDK Documentation.

Contributors

Breaking Changes

  • GET /v1/spicepods no longer returns the full spicepod.yaml JSON representation. A summary is returned instead. See #8404.

Cookbook Updates

No major cookbook updates.

The Spice Cookbook includes 82+ recipes to help you get started with Spice quickly and easily.

Upgrading

To upgrade to v1.10.1, use one of the following methods:

CLI:

spice upgrade

Homebrew:

brew upgrade spiceai/spiceai/spice

Docker:

Pull the spiceai/spiceai:1.10.1 image:

docker pull spiceai/spiceai:1.10.1

For available tags, see DockerHub.

Helm:

helm repo update
helm upgrade spiceai spiceai/spiceai

AWS Marketplace:

🎉 Spice is now available in the AWS Marketplace!

What's Changed

Changelog

  • Return summarized spicepods from /v1/spicepods by @phillipleblanc in #8404
  • DynamoDB tests and fixes by @lukekim in #8491
  • Use an isolated Tokio runtime for refresh tasks that is separate from the main query API by @phillipleblanc in #8504
  • fix: Avoid double hashing cache key by @peasee in #8511
  • fix: Remove unused Cayenne parameters by @peasee in #8500
  • feat: Support vortex zstd compressor by @peasee in #8515
  • Fix for idle DynamoDB Stream by @krinart in #8506
  • fix: Improve Cayenne errors, ID selection for table/partition creation by @peasee in #8523
  • Update dependencies by @phillipleblanc in #8513
  • Upgrade to gospice v8 by @lukekim in #8524
  • fix: Add recursion depth limits to prevent DoS via deeply nested data (DynamoDB + S3 Vectors) by @phillipleblanc in #8544
  • fix: Add warning when multiple partitions are defined for the same table by @peasee in #8540
  • fix: Eagerly drop cached records for results larger than max by @peasee in #8516
  • DDB Streams Integration Test + Memory Acceleration + Improved Warning by @krinart in #8520
  • fix(cluster): initialize secrets before object stores in executor by @sgrebnov in #8532
  • Show user-friendly error on empty DDB table by @krinart in #8586
  • Move 'test_projection_pushdown' to runtime-datafusion by @Jeadie in #8490
  • Fix stats for rewritten DistributeFileScanOptimizer plans by @mach-kernel in #8581

Spice v1.10.0 (Dec 9, 2025)

· 18 min read
William Croxson
Senior Software Engineer at Spice AI

Announcing the release of Spice v1.10.0! ⚡

Spice v1.10.0 introduces a new Caching Acceleration Mode with stale-while-revalidate (SWR) semantics for disk-persisted, low-latency queries with background refresh. This release also adds the TinyLFU eviction policy for the SQL results cache, a preview of the DynamoDB Streams connector for real-time CDC, S3 location predicate pruning for faster partitioned queries, improved distributed query execution, and multiple security hardening improvements.

What's New in v1.10.0

Caching Acceleration Mode

Low-Latency Queries with Background Refresh: This release introduces a new caching acceleration mode that implements the stale-while-revalidate (SWR) pattern. Queries return cached results immediately while data refreshes asynchronously in the background, eliminating query latency spikes during refresh cycles. Cached data persists to disk using DuckDB, SQLite, or Cayenne file modes.

Key Features:

  • Stale-While-Revalidate (SWR): Returns cached data immediately while refreshing in the background, reducing query latency
  • Disk Persistence: Cached results persist across restarts using DuckDB, SQLite, or Cayenne file modes
  • Configurable Refresh: Control refresh intervals with refresh_check_interval to balance freshness and source load

Recommendation: Use retention configuration with caching acceleration to ensure stale data is cleaned up over time.

Example spicepod.yaml configuration:

datasets:
  - from: http://localhost:7400
    name: cached_data
    time_column: fetched_at
    acceleration:
      enabled: true
      engine: duckdb
      mode: file # Persist cache to disk
      refresh_mode: caching
      refresh_check_interval: 10m
      retention_check_enabled: true
      retention_period: 24h
      retention_check_interval: 1h

For more details, refer to the Data Acceleration Documentation.

TinyLFU Cache Eviction Policy

Higher Cache Hit Rates for SQL Results Cache: A new TinyLFU cache eviction policy is now available for the SQL results cache. TinyLFU is a probabilistic cache admission policy that maintains higher hit rates than LRU while keeping memory usage predictable, making it ideal for workloads with varying query frequency patterns.

Example spicepod.yaml configuration:

runtime:
  caching:
    sql_results:
      enabled: true
      eviction_policy: tiny_lfu # default: lru

For more details, refer to the Caching Documentation and the Moka TinyLFU Documentation for details of the algorithm.

DynamoDB Streams Data Connector (Preview)

Real-Time Change Data Capture for DynamoDB: The DynamoDB connector now integrates with DynamoDB Streams for real-time change data capture (CDC). This enables continuous synchronization of DynamoDB table changes into Spice for real-time query, search, and LLM inference.

Key Features:

  • Real-Time CDC: Automatically captures inserts, updates, and deletes from DynamoDB tables as they occur
  • Table Bootstrapping: Performs an initial full table scan before streaming changes, ensuring complete data consistency
  • Acceleration Integration: Works with refresh_mode: changes to incrementally update accelerated datasets

Note: DynamoDB Streams must be enabled on your DynamoDB table. This feature is in preview.

Example spicepod.yaml configuration:

datasets:
  - from: dynamodb:my_table
    name: orders_stream
    acceleration:
      enabled: true
      refresh_mode: changes # Enable Streams capture

For more details, refer to the DynamoDB Connector Documentation.

OpenTelemetry Metrics Exporter

Spice can now push metrics to an OpenTelemetry collector, enabling integration with platforms such as Jaeger, New Relic, Honeycomb, and other OpenTelemetry-compatible backends.

Key Features:

  • Protocol Support: Supports the gRPC (default port 4317) protocol
  • Configurable Push Interval: Control how frequently metrics are pushed to the collector

Example spicepod.yaml configuration for gRPC:

runtime:
  telemetry:
    enabled: true
    otel_exporter:
      endpoint: 'localhost:4317'
      push_interval: '30s'

For more details, refer to the Observability & Monitoring Documentation.

S3 Connector Improvements

S3 Location Predicate Pruning: The S3 data connector now supports location-based predicate pruning, dramatically reducing data scanned by pushing down location filter predicates to S3 listing operations. For partitioned datasets (e.g., year=2025/month=12/), Spice now skips listing irrelevant partitions entirely, significantly reducing query latency and S3 API costs.

AWS S3 Tables Write Support: Full read/write capability for AWS S3 Tables, enabling direct integration with AWS's managed table format for S3. Use standard SQL INSERT INTO to write data.

For more details, refer to the S3 Data Connector Documentation and Glue Data Connector Documentation.

Faster Distributed Query Execution

Distributed query planning and execution have been significantly improved:

  • Fixed executor registration in cluster mode for more reliable distributed deployments
  • Improved hostname resolution for Flight server binding, enabling better executor discovery
  • Distributed accelerator registration: Data accelerators now properly register in distributed mode
  • Optimized query planning: DistributeFileScanOptimizer improvements for faster planning with large datasets

For more details, refer to the Distributed Query Documentation.

Search Improvements

Search capabilities have been improved with several performance and reliability enhancements:

  • Fixed FTS query blocking: Full-text search queries no longer block unnecessarily, improving query responsiveness
  • Optimized vector index operations: Eliminated unnecessary list_vectors calls for better performance
  • Improved limit pushdown: IndexerExec now properly handles limit pushdown for more efficient searches

For more details, refer to the Search Documentation.

Security Hardening

Multiple security improvements have been implemented:

  • SQL Identifier Quoting: Hardened SQL identifier quoting across all database connectors (PostgreSQL, MySQL, DuckDB, etc.) to prevent SQL injection attacks through table or column names
  • Token Redaction: Sensitive authentication tokens are now fully redacted in debug and error output, preventing accidental credential exposure in logs
  • Path Traversal Prevention: Fixed tar extraction operations to prevent directory traversal vulnerabilities when processing archived files
  • Input Sanitization: Added strict validation for top_n_sample order_by clause parsing to prevent injection attacks
  • Glue Credential Handling: Prevented automatic loading of AWS credentials from environment in Glue connector, ensuring explicit credential configuration

Developer Experience Improvements

  • Health probe metrics: Added health probe latency metrics for better observability
  • CLI improvements: Fixed .clear history command in the REPL to fully clear persisted history

Contributors

Breaking Changes

No breaking changes.

Cookbook Updates

No major cookbook updates.

The Spice Cookbook includes 82 recipes to help you get started with Spice quickly and easily.

Upgrading

To upgrade to v1.10.0, use one of the following methods:

CLI:

spice upgrade

Homebrew:

brew upgrade spiceai/spiceai/spice

Docker:

Pull the spiceai/spiceai:1.10.0 image:

docker pull spiceai/spiceai:1.10.0

For available tags, see DockerHub.

Helm:

helm repo update
helm upgrade spiceai spiceai/spiceai

AWS Marketplace:

🎉 Spice is now available in the AWS Marketplace!

What's Changed

Changelog

Spice v1.10.0-rc.1 (Dec 2, 2025)

· 11 min read
David Stancu
Principal Software Engineer at Spice AI

Announcing the release of Spice v1.10.0-rc.1! ⚡

v1.10.0-rc.1 is a release candidate for early testing of v1.10 features ahead of v1.10.0-stable, including an all-new caching acceleration mode, a tiny_lfu cache eviction policy, a new DynamoDB Streams connector (Preview), improvements to the DynamoDB connector, faster distributed query execution, S3 connector improvements, and security hardening.

What's New in v1.10.0-rc.1

Caching Acceleration Mode with SWR and TinyLFU

This release introduces a new caching acceleration mode that implements the stale-while-revalidate (SWR) pattern using Data Accelerators such as DuckDB or Cayenne, enabling queries to return file-persisted cached results immediately while asynchronously refreshing data in the background. Combined with the new TinyLFU cache eviction policy, Spice can now maintain higher cache hit rates while keeping memory usage predictable.

Key Features:

  • Stale-While-Revalidate (SWR): Returns cached data immediately while refreshing in the background
  • Data Accelerator Support: Cached accelerators can persist data to disk using DuckDB, SQLite, or Cayenne file modes.
  • TinyLFU Cache Policy: Probabilistic cache admission policy that maintains high hit rates with minimal overhead
  • Predictable Memory Usage: Configurable memory limits with automatic eviction of less frequently used entries

Example Spicepod.yml configuration:

runtime:
  caching:
    sql_results:
      enabled: true
      eviction_policy: tiny_lfu # default: lru

datasets:
  - from: s3://my-bucket/data.parquet
    name: cached_data
    acceleration:
      enabled: true
      engine: duckdb
      mode: file # Persist cache to disk
      refresh_mode: caching
      refresh_check_interval: 10m

For more details, refer to the Data Acceleration Documentation and Caching Documentation.

DynamoDB Streams Data Connector in Preview

The DynamoDB connector now integrates with DynamoDB Streams, enabling real-time streaming with support for both table bootstrapping and continuous change data capture (CDC). The connector automatically detects changes in DynamoDB tables and streams them into Spice for real-time query, search, and LLM inference.

Key Features:

  • Real-Time CDC: Automatically captures inserts, updates, and deletes from DynamoDB tables
  • Table Bootstrapping: Initial full table load before streaming changes

Example Spicepod.yml configuration:

datasets:
  - from: dynamodb:my_table
    name: orders_stream
    acceleration:
      enabled: true
      refresh_mode: changes

For more details, refer to the DynamoDB Connector Documentation.

Cayenne Accelerator Enhancements

The Cayenne data accelerator now supports:

  • Sort Columns Configuration: Optimize inserts by pre-sorting data on specified columns for improved query performance

Example Spicepod.yml configuration:

datasets:
  - from: s3://my-bucket/data.parquet
    name: sorted_data
    acceleration:
      enabled: true
      engine: cayenne
      mode: file_create
      params:
        sort_columns: timestamp,region

For more details, refer to the Cayenne Documentation.

S3 Connector Improvements

S3 Location Predicate Pruning: The S3 data connector now supports location-based predicate pruning, dramatically reducing data scanned by pushing down predicates to S3 listing operations. This optimization is especially effective for partitioned datasets stored in S3.

AWS S3 Tables Write Support: Full read/write capability for AWS S3 Tables, enabling fast integration with AWS's table format for S3.

For more details, refer to the S3 Tables Data Connector Documentation and Glue Data Connection Documentation.

Faster Distributed Query Execution

Distributed query planning and execution have been significantly improved:

  • Fixed executor registration in cluster mode for more reliable distributed deployments
  • Improved hostname resolution for Flight server binding, enabling better executor discovery
  • Distributed accelerator registration: Data accelerators now properly register in distributed mode
  • Optimized query planning: DistributeFileScanOptimizer improvements for faster planning with large datasets

For more details, refer to the Distributed Query Documentation.

Search Improvements

Search capabilities have been improved with several performance and reliability enhancements:

  • Fixed FTS query blocking: Full-text search queries no longer block unnecessarily, improving query responsiveness
  • Optimized vector index operations: Eliminated unnecessary list_vectors calls for better performance
  • Improved limit pushdown: IndexerExec now properly handles limit pushdown for more efficient searches

For more details, refer to the Search Documentation.

Security Hardening

Multiple security improvements have been implemented:

  • SQL identifier quoting: Hardened SQL identifier quoting across all connectors to prevent injection attacks
  • Token redaction: Sensitive tokens are now fully redacted in debug output to prevent credential leakage
  • Path traversal prevention: Fixed tar extraction to prevent path traversal vulnerabilities
  • Input sanitization: Added validation for top_n_sample order_by parsing
  • Improved credential handling: Improved credential management in Glue connector

Developer Experience Improvements

  • Health probe metrics: Added health probe latency metrics for better observability
  • CLI improvements: Fixed .clear history command in the REPL to fully clear persisted history

Contributors

Breaking Changes

No breaking changes.

Cookbook Updates

No major cookbook updates. The Spice Cookbook still offers 82+ recipes to help you prototype quickly.

Upgrading

To try v1.10.0-rc1, use one of the following methods:

CLI:

spice upgrade --version 1.10.0-rc1

Homebrew:

brew upgrade spiceai/spiceai/spice

Docker:

Pull the spiceai/spiceai:1.10.0-rc1 image:

docker pull spiceai/spiceai:1.10.0-rc1

For available tags, see DockerHub.

Helm:

helm repo update
helm upgrade spiceai spiceai/spiceai --version 1.10.0-rc1

AWS Marketplace:

🎉 Spice is available in the AWS Marketplace.

What's Changed

Changelog