feature(cluster): use control-connection query fallback for public address connections #14512

Merged: fruch merged 2 commits into scylladb:master from fruch:use-dynamic-whitelist-policy on May 12, 2026

fruch (Contributor) commented May 4, 2026

Summary

When IP_SSH_CONNECTIONS is set to public, the CQL driver cannot establish node pools because nodes are accessed via their public IPs, which differ from the broadcast RPC addresses advertised by the cluster. This causes NoHostAvailable errors.

This PR uses the new ControlConnectionQueryFallback.SkipPoolCreation option (added in python-driver PR #878, released in scylla-driver 3.29.10) to fall back to the control connection for queries when public addresses are in use.
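The address mismatch described above can be illustrated with a small, driver-independent sketch. It assumes, as the summary states, that node pools are opened against the broadcast RPC addresses the cluster advertises, while the test runner can only reach the public IPs. The helper name `reachable_pool_targets` and all addresses are hypothetical, chosen only for illustration.

```python
def reachable_pool_targets(advertised_rpc_addresses, reachable_addresses):
    """Return the advertised addresses the client could actually open a pool to."""
    reachable = set(reachable_addresses)
    return [addr for addr in advertised_rpc_addresses if addr in reachable]


# Hypothetical addresses: the runner connects over public IPs...
public_contact_points = ["203.0.113.10", "203.0.113.11"]
# ...but the cluster advertises private broadcast RPC addresses.
advertised_rpc_addresses = ["10.0.0.10", "10.0.0.11"]

usable = reachable_pool_targets(advertised_rpc_addresses, public_contact_points)
print(usable)  # [] -> no pool can be created, which surfaces as NoHostAvailable
```

With no usable pool targets, only the already-established control connection remains, which is what the fallback relies on.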

What changed

  • sdcm/cluster.py: In _create_session, when IP_SSH_CONNECTIONS == "public", pass allow_control_connection_query_fallback=ControlConnectionQueryFallback.SkipPoolCreation to the ClusterDriver constructor. This tells the driver to skip node-pool creation entirely and route all application queries through the control connection.
  • pyproject.toml: Update scylla-driver from 3.29.9 to 3.29.10 (the first release containing ControlConnectionQueryFallback).
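A minimal sketch of the conditional described in the list above, not the actual `_create_session` code. The enum here is a local stand-in so the snippet runs without the driver installed; the helper names `build_cluster_kwargs` and `driver_supports_fallback` are hypothetical, and the version check assumes plain numeric version components.

```python
from enum import Enum


class ControlConnectionQueryFallback(Enum):
    """Local stand-in for the driver's enum; only the member named in this PR."""
    SkipPoolCreation = "skip_pool_creation"  # placeholder value, not the driver's


def driver_supports_fallback(version: str) -> bool:
    """True if the version is at least 3.29.10 (assumes numeric components)."""
    return tuple(int(part) for part in version.split(".")[:3]) >= (3, 29, 10)


def build_cluster_kwargs(ip_ssh_connections: str) -> dict:
    """Extra keyword arguments to pass to the cluster constructor."""
    kwargs = {}
    if ip_ssh_connections == "public":
        # Skip node-pool creation and route queries via the control connection.
        kwargs["allow_control_connection_query_fallback"] = (
            ControlConnectionQueryFallback.SkipPoolCreation
        )
    return kwargs


print(build_cluster_kwargs("public"))
print(build_cluster_kwargs("private"))  # {}
```

Building the keyword arguments separately keeps the non-public path unchanged: when `IP_SSH_CONNECTIONS` is anything other than `public`, nothing extra is passed and the driver creates node pools as before.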

Previous approach (dropped)

The original implementation used DynamicWhiteListRoundRobinPolicy from an unmerged python-driver branch (PR #833), which required pinning scylla-driver to a git ref. That PR was not accepted. This rework uses the officially released and merged approach instead.

Backends Affected

  • AWS - public IP connections
  • GCE - public IP connections
  • Azure - public IP connections

Manual Testing Required

  • Public IP CQL connections: Requires cloud backend with IP_SSH_CONNECTIONS=public
    • Test: uv run sct.py run-test artifacts_test.ArtifactsTest.test_scylla_service --backend aws --config test-cases/artifacts/ami.yaml --config configurations/network_config/test_communication_public.yaml
    • Expected: CQL connections succeed via control-connection fallback

@fruch fruch requested a review from dkropachev May 4, 2026 18:37
@fruch fruch added the New Hydra Version, test-integration, ai-assisted, and backport/2026.2 labels and removed the New Hydra Version label May 11, 2026
fruch added a commit to fruch/scylla-cluster-tests that referenced this pull request May 11, 2026
…dress connections

Replace DynamicWhiteListRoundRobinPolicy (from unmerged python-driver PR scylladb#833) with
ControlConnectionQueryFallback.SkipPoolCreation (from merged python-driver PR scylladb#878,
available in scylla-driver 3.29.10). When IP_SSH_CONNECTIONS is set to 'public', the
driver falls back to the control connection for queries instead of trying to create
node pools that can't reach broadcast RPC addresses.

Drop run-ami-artifact-test.sh convenience script.

[scylladb#14512]
@fruch fruch force-pushed the use-dynamic-whitelist-policy branch from ecdc90f to 2986b1e May 11, 2026 13:53
@fruch fruch changed the title from "feature(cluster): use DynamicWhiteListRoundRobinPolicy for public address connections" to "feature(cluster): use control-connection query fallback for public address connections" May 11, 2026
@fruch fruch added the New Hydra Version label May 11, 2026
fruch added a commit to fruch/scylla-cluster-tests that referenced this pull request May 11, 2026
…dress connections

When IP_SSH_CONNECTIONS is set to 'public', the CQL driver cannot establish
node pools because nodes are accessed via public IPs that differ from the
broadcast RPC addresses. Use ControlConnectionQueryFallback.SkipPoolCreation
(scylla-driver 3.29.10, python-driver PR scylladb#878) to route queries through the
control connection instead.

Update scylla-driver from 3.29.9 to 3.29.10.

[scylladb#14512]
@fruch fruch force-pushed the use-dynamic-whitelist-policy branch from 2986b1e to cf01e99 May 11, 2026 13:56
@fruch fruch added the New Hydra Version and P1 Urgent labels and removed the New Hydra Version label May 11, 2026
scylladb-promoter (Collaborator) commented May 11, 2026

✅ Test Summary: PASSED

✅ Precommit: PASSED

| Total | Passed | Failed | Skipped |
|-------|--------|--------|---------|
| 38    | 16     | 0      | 22      |

✅ Tests: PASSED

| Total | Passed | Failed | Errors | Skipped |
|-------|--------|--------|--------|---------|
| 3264  | 3246   | 0      | 0      | 18      |

Full build log

@fruch fruch marked this pull request as ready for review May 11, 2026 20:17
@fruch fruch requested a review from soyacz as a code owner May 11, 2026 20:17
@fruch fruch force-pushed the use-dynamic-whitelist-policy branch from d8a31d1 to 70f563e May 12, 2026 05:08
soyacz (Contributor) commented May 12, 2026

maybe we should use it in the artifact test (we use it in Azure tests: jenkins-pipelines/oss/artifacts/artifacts-azure-image.jenkinsfile)?

fruch (Contributor, Author) commented May 12, 2026

> maybe we should use it in the artifact test (we use it in Azure tests: jenkins-pipelines/oss/artifacts/artifacts-azure-image.jenkinsfile)?

I've tested this locally using public communication with AWS; Azure should behave the same.

@fruch fruch added New Hydra Version PR# introduces new Hydra version and removed New Hydra Version PR# introduces new Hydra version labels May 12, 2026
soyacz (Contributor) left a comment

LGTM

@fruch fruch merged commit 1e95879 into scylladb:master May 12, 2026
5 checks passed
fruch added a commit that referenced this pull request May 12, 2026
scylladbbot pushed a commit to scylladbbot/scylla-cluster-tests that referenced this pull request May 12, 2026
(cherry picked from commit 0d6eb56)
fruch added a commit that referenced this pull request May 12, 2026
(cherry picked from commit 0d6eb56)
Labels

ai-assisted, backport/2026.2-done, New Hydra Version (PR# introduces new Hydra version), P1 Urgent, promoted-to-master, test-integration (Enable running the integration tests suite)

4 participants