
Airflow 3.x API based connector #26624

Merged
TeddyCr merged 37 commits into main from airflow_3 on Mar 26, 2026

Conversation

@harshach
Collaborator

@harshach harshach commented Mar 20, 2026

Describe your changes:

Collate PR: https://github.com/open-metadata/openmetadata-collate/pull/3343

Fixes

I worked on ... because ...

Type of change:

  • Bug fix
  • Improvement
  • New feature
  • Breaking change (fix or feature that would cause existing functionality to not work as expected)
  • Documentation

Checklist:

  • I have read the CONTRIBUTING document.
  • My PR title is Fixes <issue-number>: <short explanation>
  • I have commented on my code, particularly in hard-to-understand areas.
  • For JSON Schema changes: I updated the migration scripts or explained why it is not needed.

Summary by Gitar

  • Airflow 3.x API Connector:
    • New AirflowRestApiConnectionClassConverter for REST API authentication with Airflow connections
    • Added client.py, auth.py, models.py modules for Airflow API interactions
    • Implemented MWAA (Managed Workflows for Apache Airflow) authentication config support
    • Added service account and token-based auth for Google Composer managed Airflow
  • OpenLineage Integration:
    • Enhanced OpenLineageEntityResolver with improved entity resolution logic (+142,-28)
    • New OpenLineage lineage test DAGs and integration tests for connector validation
    • Updated schema models for openLineageFacets.json and connection configurations
  • Tests:
    • Added 484+ lines of integration tests and 1070+ lines of unit tests for the Airflow API connector, MWAA client, and lineage resolution

This will update automatically on new commits.

@harshach harshach requested review from a team as code owners March 20, 2026 03:57
Copilot AI review requested due to automatic review settings March 20, 2026 03:57
@github-actions github-actions bot added the backend and safe to test labels Mar 20, 2026
@github-actions
Contributor

✅ TypeScript Types Auto-Updated

The generated TypeScript types have been automatically updated based on JSON schema changes in this PR.

Contributor

Copilot AI left a comment

Pull request overview

Adds an Airflow 3.x REST API–based pipeline connector (AirflowApi) across spec, UI, service test-connection definitions, and ingestion (client/source + tests), enabling Airflow metadata extraction without relying on the Airflow metadata DB connector.

Changes:

  • Introduces new PipelineServiceType value AirflowApi and wires it into schema selection + service icon handling in the UI.
  • Adds a new connection JSON schema (airflowApiConnection.json) and updates pipelineService.json to include it.
  • Implements the ingestion connector (AirflowApiClient + AirflowApiSource) with unit tests and E2E integration tests (plus sample DAGs), and adds service-side test-connection step definitions.

Reviewed changes

Copilot reviewed 15 out of 32 changed files in this pull request and generated 3 comments.

File Description
openmetadata-ui/src/main/resources/ui/src/utils/ServiceUtilClassBase.ts Maps airflowapi service type to the Airflow icon.
openmetadata-ui/src/main/resources/ui/src/utils/PipelineServiceUtils.ts Adds UI connection-schema selection for PipelineServiceType.AirflowApi.
openmetadata-spec/src/main/resources/json/schema/entity/services/pipelineService.json Adds AirflowApi to the service type enum and includes its connection schema in oneOf.
openmetadata-spec/src/main/resources/json/schema/entity/services/connections/pipeline/airflowApiConnection.json Defines the new Airflow API connection config schema (host/auth/version/etc.).
openmetadata-service/src/main/resources/json/data/testConnections/pipeline/airflowApi.json Adds backend test-connection step definitions for AirflowApi.
ingestion/src/metadata/ingestion/source/pipeline/airflowapi/service_spec.py Registers the new ingestion source via ServiceSpec.
ingestion/src/metadata/ingestion/source/pipeline/airflowapi/models.py Adds Pydantic models for Airflow REST API responses.
ingestion/src/metadata/ingestion/source/pipeline/airflowapi/client.py Implements the REST client with pagination + API version detection.
ingestion/src/metadata/ingestion/source/pipeline/airflowapi/connection.py Implements connection creation and test-connection step mapping.
ingestion/src/metadata/ingestion/source/pipeline/airflowapi/metadata.py Implements pipeline + task + status extraction via REST API.
ingestion/src/metadata/ingestion/source/pipeline/airflowapi/__init__.py Package init for the new connector module.
ingestion/tests/unit/topology/pipeline/test_airflowapi.py Unit tests for status mapping, models, client behavior, pagination, and URL generation.
ingestion/tests/integration/airflow/test_airflowapi_connector.py E2E test covering service creation, pipeline/task ingestion, status ingestion, and basic OpenLineage endpoint validation.
ingestion/tests/integration/airflow/test_dags/sample_etl.py Sample DAG for E2E ingestion and task graph validation.
ingestion/tests/integration/airflow/test_dags/sample_branching.py Sample branching DAG to validate parallel task structures.
ingestion/tests/integration/airflow/test_dags/lineage_etl.py Sample DAG emitting lineage via Airflow’s OpenLineage support for E2E lineage scenarios.
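
The pagination that the overview attributes to client.py can be sketched roughly as follows. This is a minimal illustration and not the connector's code: `fetch_page` and `fake_fetch` are hypothetical stand-ins for the HTTP call, and the `limit`/`offset` parameters and `total_entries` field are assumed to follow the shape of Airflow REST API list responses.

```python
from typing import Any, Callable, Dict, Iterator, List

# Hypothetical page fetcher: takes (limit, offset) and returns a dict shaped like
# an Airflow list response, e.g. {"dags": [...], "total_entries": 5}.
PageFetcher = Callable[[int, int], Dict[str, Any]]


def paginate(fetch_page: PageFetcher, key: str, limit: int = 100) -> Iterator[Any]:
    """Yield every item under `key`, requesting pages until all entries are seen."""
    offset = 0
    while True:
        page = fetch_page(limit, offset)
        items: List[Any] = page.get(key, [])
        yield from items
        offset += len(items)
        total = page.get("total_entries")
        # Stop on an empty page, or once the reported total has been reached.
        if not items or (total is not None and offset >= total):
            break


def fake_fetch(limit: int, offset: int) -> Dict[str, Any]:
    """In-memory stand-in for the REST call, used only for this example."""
    dags = [{"dag_id": f"dag_{i}"} for i in range(5)]
    return {"dags": dags[offset : offset + limit], "total_entries": len(dags)}


dag_ids = [dag["dag_id"] for dag in paginate(fake_fetch, "dags", limit=2)]
```

Passing the fetcher in as a callable keeps the loop independent of the HTTP layer, which also makes it easy to unit test without a running Airflow instance.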

Comment on lines +215 to +222
timestamp = datetime_to_ts(dag_run.execution_date)
pipeline_status = PipelineStatus(
    executionId=dag_run.dag_run_id,
    taskStatus=task_statuses,
    executionStatus=STATUS_MAP.get(
        dag_run.state, StatusType.Pending.value
    ),
    timestamp=Timestamp(timestamp) if timestamp else None,

Copilot AI Mar 20, 2026

PipelineStatus requires timestamp (and executionStatus) per the Pipeline schema. Setting timestamp to None will cause model validation errors and can prevent status ingestion. Consider skipping DAG runs without a resolvable execution timestamp, or falling back to another available datetime field (e.g., start_date/end_date) so timestamp is always populated.

Suggested change
- timestamp = datetime_to_ts(dag_run.execution_date)
- pipeline_status = PipelineStatus(
-     executionId=dag_run.dag_run_id,
-     taskStatus=task_statuses,
-     executionStatus=STATUS_MAP.get(
-         dag_run.state, StatusType.Pending.value
-     ),
-     timestamp=Timestamp(timestamp) if timestamp else None,
+ # Resolve a non-null timestamp for the pipeline run, falling back to
+ # alternative datetime fields if execution_date is not available.
+ timestamp = datetime_to_ts(dag_run.execution_date)
+ if timestamp is None:
+     dag_run_start_date = getattr(dag_run, "start_date", None)
+     if dag_run_start_date:
+         timestamp = datetime_to_ts(dag_run_start_date)
+ if timestamp is None:
+     dag_run_end_date = getattr(dag_run, "end_date", None)
+     if dag_run_end_date:
+         timestamp = datetime_to_ts(dag_run_end_date)
+ if timestamp is None:
+     logger.debug(
+         "Skipping DAG run %s for DAG %s as no execution timestamp "
+         "could be resolved from execution_date, start_date or end_date.",
+         dag_run.dag_run_id,
+         pipeline_details.dag_id,
+     )
+     continue
+ pipeline_status = PipelineStatus(
+     executionId=dag_run.dag_run_id,
+     taskStatus=task_statuses,
+     executionStatus=STATUS_MAP.get(
+         dag_run.state, StatusType.Pending.value
+     ),
+     timestamp=Timestamp(timestamp),

Copilot uses AI. Check for mistakes.
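
The fallback chain suggested above can also be expressed as a small standalone helper. This is only a sketch: `DagRun` is a hypothetical stand-in for the connector's model, and the helper assumes the timestamp is wanted in epoch milliseconds, which may not match what `datetime_to_ts` actually returns.

```python
from dataclasses import dataclass
from datetime import datetime, timezone
from typing import Optional


@dataclass
class DagRun:
    """Hypothetical stand-in for the connector's DAG run model."""

    dag_run_id: str
    execution_date: Optional[datetime] = None
    start_date: Optional[datetime] = None
    end_date: Optional[datetime] = None


def resolve_run_timestamp_ms(run: DagRun) -> Optional[int]:
    """Return epoch milliseconds from the first populated date field, else None."""
    for candidate in (run.execution_date, run.start_date, run.end_date):
        if candidate is not None:
            return int(candidate.timestamp() * 1000)
    return None


# A run with no execution_date falls back to start_date.
started = datetime(2026, 3, 20, tzinfo=timezone.utc)
run = DagRun("manual__1", start_date=started)
ts = resolve_run_timestamp_ms(run)

# A run with no usable date yields None, so the caller can skip it.
missing = resolve_run_timestamp_ms(DagRun("manual__2"))
```

Returning None rather than raising leaves the skip-and-log decision to the caller, as in the suggested change.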
Comment on lines +153 to +156
    tags.append(tag.get("name", ""))
else:
    tags.append(str(tag))

Copilot AI Mar 20, 2026

When normalizing DAG tags, tag.get("name", "") can append an empty string if the API returns an unexpected tag object. That can later cause invalid/empty tag labels to be created. Consider filtering out falsy/empty tag names when building the tags list (and/or skipping non-string/non-dict entries).

Suggested change
-     tags.append(tag.get("name", ""))
- else:
-     tags.append(str(tag))
+     name = tag.get("name")
+ elif isinstance(tag, str):
+     name = tag
+ else:
+     # Skip unexpected tag types to avoid invalid/empty labels
+     continue
+ if not name:
+     # Skip falsy/empty tag names
+     continue
+ tags.append(str(name))

Copilot uses AI. Check for mistakes.
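
For reference, the filtering described in this comment can be written as a self-contained function. This is a sketch under the assumption that tags arrive either as dicts with a `name` key or as plain strings; `normalize_tags` is a hypothetical name, not a function from the PR.

```python
from typing import Any, Iterable, List


def normalize_tags(raw_tags: Iterable[Any]) -> List[str]:
    """Collect non-empty tag names from dict-shaped or plain-string tags."""
    names: List[str] = []
    for tag in raw_tags:
        if isinstance(tag, dict):
            name = tag.get("name")
        elif isinstance(tag, str):
            name = tag
        else:
            # Skip unexpected tag types instead of stringifying them.
            continue
        if not name:
            # Skip missing or empty names so no blank label is created.
            continue
        names.append(str(name))
    return names


tags = normalize_tags([{"name": "etl"}, "daily", {"name": ""}, {}, None, 42])
# tags == ["etl", "daily"]
```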
"title": "Host And Port",
"description": "URL to the Airflow REST API. E.g., http://localhost:8080",
"type": "string",
"format": "uri"

Copilot AI Mar 20, 2026

Most pipeline connection schemas mark hostPort as "expose": true (e.g., Airflow/Airbyte/Nifi). Missing this flag here can make hostPort behave inconsistently with other services in tooling/UI that relies on expose. Consider adding "expose": true under hostPort for consistency.

Suggested change
- "format": "uri"
+ "format": "uri",
+ "expose": true

Copilot uses AI. Check for mistakes.
@github-actions
Contributor

github-actions bot commented Mar 20, 2026

🛡️ TRIVY SCAN RESULT 🛡️

Target: openmetadata-ingestion-base-slim:trivy (debian 12.13)

No Vulnerabilities Found

🛡️ TRIVY SCAN RESULT 🛡️

Target: Java

Vulnerabilities (38)

Package Vulnerability ID Severity Installed Version Fixed Version
com.fasterxml.jackson.core:jackson-core CVE-2025-52999 🚨 HIGH 2.12.7 2.15.0
com.fasterxml.jackson.core:jackson-core GHSA-72hv-8253-57qq 🚨 HIGH 2.12.7 2.18.6, 2.21.1, 3.1.0
com.fasterxml.jackson.core:jackson-core CVE-2025-52999 🚨 HIGH 2.13.4 2.15.0
com.fasterxml.jackson.core:jackson-core GHSA-72hv-8253-57qq 🚨 HIGH 2.13.4 2.18.6, 2.21.1, 3.1.0
com.fasterxml.jackson.core:jackson-core GHSA-72hv-8253-57qq 🚨 HIGH 2.15.2 2.18.6, 2.21.1, 3.1.0
com.fasterxml.jackson.core:jackson-databind CVE-2022-42003 🚨 HIGH 2.12.7 2.12.7.1, 2.13.4.2
com.fasterxml.jackson.core:jackson-databind CVE-2022-42004 🚨 HIGH 2.12.7 2.12.7.1, 2.13.4
com.google.code.gson:gson CVE-2022-25647 🚨 HIGH 2.2.4 2.8.9
com.google.protobuf:protobuf-java CVE-2021-22569 🚨 HIGH 3.3.0 3.16.1, 3.18.2, 3.19.2
com.google.protobuf:protobuf-java CVE-2022-3509 🚨 HIGH 3.3.0 3.16.3, 3.19.6, 3.20.3, 3.21.7
com.google.protobuf:protobuf-java CVE-2022-3510 🚨 HIGH 3.3.0 3.16.3, 3.19.6, 3.20.3, 3.21.7
com.google.protobuf:protobuf-java CVE-2024-7254 🚨 HIGH 3.3.0 3.25.5, 4.27.5, 4.28.2
com.google.protobuf:protobuf-java CVE-2021-22569 🚨 HIGH 3.7.1 3.16.1, 3.18.2, 3.19.2
com.google.protobuf:protobuf-java CVE-2022-3509 🚨 HIGH 3.7.1 3.16.3, 3.19.6, 3.20.3, 3.21.7
com.google.protobuf:protobuf-java CVE-2022-3510 🚨 HIGH 3.7.1 3.16.3, 3.19.6, 3.20.3, 3.21.7
com.google.protobuf:protobuf-java CVE-2024-7254 🚨 HIGH 3.7.1 3.25.5, 4.27.5, 4.28.2
com.nimbusds:nimbus-jose-jwt CVE-2023-52428 🚨 HIGH 9.8.1 9.37.2
com.squareup.okhttp3:okhttp CVE-2021-0341 🚨 HIGH 3.12.12 4.9.2
commons-beanutils:commons-beanutils CVE-2025-48734 🚨 HIGH 1.9.4 1.11.0
commons-io:commons-io CVE-2024-47554 🚨 HIGH 2.8.0 2.14.0
dnsjava:dnsjava CVE-2024-25638 🚨 HIGH 2.1.7 3.6.0
io.airlift:aircompressor CVE-2025-67721 🚨 HIGH 0.27 2.0.3
io.netty:netty-codec-http2 CVE-2025-55163 🚨 HIGH 4.1.96.Final 4.2.4.Final, 4.1.124.Final
io.netty:netty-codec-http2 GHSA-xpw8-rcwv-8f8p 🚨 HIGH 4.1.96.Final 4.1.100.Final
io.netty:netty-handler CVE-2025-24970 🚨 HIGH 4.1.96.Final 4.1.118.Final
net.minidev:json-smart CVE-2021-31684 🚨 HIGH 1.3.2 1.3.3, 2.4.4
net.minidev:json-smart CVE-2023-1370 🚨 HIGH 1.3.2 2.4.9
org.apache.avro:avro CVE-2024-47561 🔥 CRITICAL 1.7.7 1.11.4
org.apache.avro:avro CVE-2023-39410 🚨 HIGH 1.7.7 1.11.3
org.apache.derby:derby CVE-2022-46337 🔥 CRITICAL 10.14.2.0 10.14.3, 10.15.2.1, 10.16.1.2, 10.17.1.0
org.apache.ivy:ivy CVE-2022-46751 🚨 HIGH 2.5.1 2.5.2
org.apache.mesos:mesos CVE-2018-1330 🚨 HIGH 1.4.3 1.6.0
org.apache.spark:spark-core_2.12 CVE-2025-54920 🚨 HIGH 3.5.6 3.5.7
org.apache.thrift:libthrift CVE-2019-0205 🚨 HIGH 0.12.0 0.13.0
org.apache.thrift:libthrift CVE-2020-13949 🚨 HIGH 0.12.0 0.14.0
org.apache.zookeeper:zookeeper CVE-2023-44981 🔥 CRITICAL 3.6.3 3.7.2, 3.8.3, 3.9.1
org.eclipse.jetty:jetty-server CVE-2024-13009 🚨 HIGH 9.4.56.v20240826 9.4.57.v20241219
org.lz4:lz4-java CVE-2025-12183 🚨 HIGH 1.8.0 1.8.1

🛡️ TRIVY SCAN RESULT 🛡️

Target: Node.js

No Vulnerabilities Found

🛡️ TRIVY SCAN RESULT 🛡️

Target: Python

Vulnerabilities (15)

Package Vulnerability ID Severity Installed Version Fixed Version
apache-airflow CVE-2025-68438 🚨 HIGH 3.1.5 3.1.6
apache-airflow CVE-2025-68675 🚨 HIGH 3.1.5 3.1.6, 2.11.1
apache-airflow CVE-2026-26929 🚨 HIGH 3.1.5 3.1.8
apache-airflow CVE-2026-28779 🚨 HIGH 3.1.5 3.1.8
apache-airflow CVE-2026-30911 🚨 HIGH 3.1.5 3.1.8
cryptography CVE-2026-26007 🚨 HIGH 42.0.8 46.0.5
jaraco.context CVE-2026-23949 🚨 HIGH 5.3.0 6.1.0
jaraco.context CVE-2026-23949 🚨 HIGH 6.0.1 6.1.0
pyOpenSSL CVE-2026-27459 🚨 HIGH 24.1.0 26.0.0
starlette CVE-2025-62727 🚨 HIGH 0.48.0 0.49.1
urllib3 CVE-2025-66418 🚨 HIGH 1.26.20 2.6.0
urllib3 CVE-2025-66471 🚨 HIGH 1.26.20 2.6.0
urllib3 CVE-2026-21441 🚨 HIGH 1.26.20 2.6.3
wheel CVE-2026-24049 🚨 HIGH 0.45.1 0.46.2
wheel CVE-2026-24049 🚨 HIGH 0.45.1 0.46.2

🛡️ TRIVY SCAN RESULT 🛡️

Target: /etc/ssl/private/ssl-cert-snakeoil.key

No Vulnerabilities Found

🛡️ TRIVY SCAN RESULT 🛡️

Target: /ingestion/pipelines/extended_sample_data.yaml

No Vulnerabilities Found

🛡️ TRIVY SCAN RESULT 🛡️

Target: /ingestion/pipelines/lineage.yaml

No Vulnerabilities Found

🛡️ TRIVY SCAN RESULT 🛡️

Target: /ingestion/pipelines/sample_data.json

No Vulnerabilities Found

🛡️ TRIVY SCAN RESULT 🛡️

Target: /ingestion/pipelines/sample_data.yaml

No Vulnerabilities Found

🛡️ TRIVY SCAN RESULT 🛡️

Target: /ingestion/pipelines/sample_data_aut.yaml

No Vulnerabilities Found

🛡️ TRIVY SCAN RESULT 🛡️

Target: /ingestion/pipelines/sample_usage.json

No Vulnerabilities Found

🛡️ TRIVY SCAN RESULT 🛡️

Target: /ingestion/pipelines/sample_usage.yaml

No Vulnerabilities Found

🛡️ TRIVY SCAN RESULT 🛡️

Target: /ingestion/pipelines/sample_usage_aut.yaml

No Vulnerabilities Found

@github-actions
Contributor

github-actions bot commented Mar 20, 2026

🛡️ TRIVY SCAN RESULT 🛡️

Target: openmetadata-ingestion:trivy (debian 12.12)

Vulnerabilities (4)

Package Vulnerability ID Severity Installed Version Fixed Version
libpam-modules CVE-2025-6020 🚨 HIGH 1.5.2-6+deb12u1 1.5.2-6+deb12u2
libpam-modules-bin CVE-2025-6020 🚨 HIGH 1.5.2-6+deb12u1 1.5.2-6+deb12u2
libpam-runtime CVE-2025-6020 🚨 HIGH 1.5.2-6+deb12u1 1.5.2-6+deb12u2
libpam0g CVE-2025-6020 🚨 HIGH 1.5.2-6+deb12u1 1.5.2-6+deb12u2

🛡️ TRIVY SCAN RESULT 🛡️

Target: Java

Vulnerabilities (39)

Package Vulnerability ID Severity Installed Version Fixed Version
com.fasterxml.jackson.core:jackson-core CVE-2025-52999 🚨 HIGH 2.12.7 2.15.0
com.fasterxml.jackson.core:jackson-core GHSA-72hv-8253-57qq 🚨 HIGH 2.12.7 2.18.6, 2.21.1, 3.1.0
com.fasterxml.jackson.core:jackson-core CVE-2025-52999 🚨 HIGH 2.13.4 2.15.0
com.fasterxml.jackson.core:jackson-core GHSA-72hv-8253-57qq 🚨 HIGH 2.13.4 2.18.6, 2.21.1, 3.1.0
com.fasterxml.jackson.core:jackson-core GHSA-72hv-8253-57qq 🚨 HIGH 2.15.2 2.18.6, 2.21.1, 3.1.0
com.fasterxml.jackson.core:jackson-core GHSA-72hv-8253-57qq 🚨 HIGH 2.16.1 2.18.6, 2.21.1, 3.1.0
com.fasterxml.jackson.core:jackson-databind CVE-2022-42003 🚨 HIGH 2.12.7 2.12.7.1, 2.13.4.2
com.fasterxml.jackson.core:jackson-databind CVE-2022-42004 🚨 HIGH 2.12.7 2.12.7.1, 2.13.4
com.google.code.gson:gson CVE-2022-25647 🚨 HIGH 2.2.4 2.8.9
com.google.protobuf:protobuf-java CVE-2021-22569 🚨 HIGH 3.3.0 3.16.1, 3.18.2, 3.19.2
com.google.protobuf:protobuf-java CVE-2022-3509 🚨 HIGH 3.3.0 3.16.3, 3.19.6, 3.20.3, 3.21.7
com.google.protobuf:protobuf-java CVE-2022-3510 🚨 HIGH 3.3.0 3.16.3, 3.19.6, 3.20.3, 3.21.7
com.google.protobuf:protobuf-java CVE-2024-7254 🚨 HIGH 3.3.0 3.25.5, 4.27.5, 4.28.2
com.google.protobuf:protobuf-java CVE-2021-22569 🚨 HIGH 3.7.1 3.16.1, 3.18.2, 3.19.2
com.google.protobuf:protobuf-java CVE-2022-3509 🚨 HIGH 3.7.1 3.16.3, 3.19.6, 3.20.3, 3.21.7
com.google.protobuf:protobuf-java CVE-2022-3510 🚨 HIGH 3.7.1 3.16.3, 3.19.6, 3.20.3, 3.21.7
com.google.protobuf:protobuf-java CVE-2024-7254 🚨 HIGH 3.7.1 3.25.5, 4.27.5, 4.28.2
com.nimbusds:nimbus-jose-jwt CVE-2023-52428 🚨 HIGH 9.8.1 9.37.2
com.squareup.okhttp3:okhttp CVE-2021-0341 🚨 HIGH 3.12.12 4.9.2
commons-beanutils:commons-beanutils CVE-2025-48734 🚨 HIGH 1.9.4 1.11.0
commons-io:commons-io CVE-2024-47554 🚨 HIGH 2.8.0 2.14.0
dnsjava:dnsjava CVE-2024-25638 🚨 HIGH 2.1.7 3.6.0
io.airlift:aircompressor CVE-2025-67721 🚨 HIGH 0.27 2.0.3
io.netty:netty-codec-http2 CVE-2025-55163 🚨 HIGH 4.1.96.Final 4.2.4.Final, 4.1.124.Final
io.netty:netty-codec-http2 GHSA-xpw8-rcwv-8f8p 🚨 HIGH 4.1.96.Final 4.1.100.Final
io.netty:netty-handler CVE-2025-24970 🚨 HIGH 4.1.96.Final 4.1.118.Final
net.minidev:json-smart CVE-2021-31684 🚨 HIGH 1.3.2 1.3.3, 2.4.4
net.minidev:json-smart CVE-2023-1370 🚨 HIGH 1.3.2 2.4.9
org.apache.avro:avro CVE-2024-47561 🔥 CRITICAL 1.7.7 1.11.4
org.apache.avro:avro CVE-2023-39410 🚨 HIGH 1.7.7 1.11.3
org.apache.derby:derby CVE-2022-46337 🔥 CRITICAL 10.14.2.0 10.14.3, 10.15.2.1, 10.16.1.2, 10.17.1.0
org.apache.ivy:ivy CVE-2022-46751 🚨 HIGH 2.5.1 2.5.2
org.apache.mesos:mesos CVE-2018-1330 🚨 HIGH 1.4.3 1.6.0
org.apache.spark:spark-core_2.12 CVE-2025-54920 🚨 HIGH 3.5.6 3.5.7
org.apache.thrift:libthrift CVE-2019-0205 🚨 HIGH 0.12.0 0.13.0
org.apache.thrift:libthrift CVE-2020-13949 🚨 HIGH 0.12.0 0.14.0
org.apache.zookeeper:zookeeper CVE-2023-44981 🔥 CRITICAL 3.6.3 3.7.2, 3.8.3, 3.9.1
org.eclipse.jetty:jetty-server CVE-2024-13009 🚨 HIGH 9.4.56.v20240826 9.4.57.v20241219
org.lz4:lz4-java CVE-2025-12183 🚨 HIGH 1.8.0 1.8.1

🛡️ TRIVY SCAN RESULT 🛡️

Target: Node.js

No Vulnerabilities Found

🛡️ TRIVY SCAN RESULT 🛡️

Target: Python

Vulnerabilities (33)

Package Vulnerability ID Severity Installed Version Fixed Version
Authlib CVE-2026-27962 🔥 CRITICAL 1.6.6 1.6.9
Authlib CVE-2026-28490 🚨 HIGH 1.6.6 1.6.9
Authlib CVE-2026-28498 🚨 HIGH 1.6.6 1.6.9
Authlib CVE-2026-28802 🚨 HIGH 1.6.6 1.6.7
PyJWT CVE-2026-32597 🚨 HIGH 2.10.1 2.12.0
Werkzeug CVE-2024-34069 🚨 HIGH 2.2.3 3.0.3
aiohttp CVE-2025-69223 🚨 HIGH 3.12.12 3.13.3
aiohttp CVE-2025-69223 🚨 HIGH 3.13.2 3.13.3
apache-airflow CVE-2025-68438 🚨 HIGH 3.1.5 3.1.6
apache-airflow CVE-2025-68675 🚨 HIGH 3.1.5 3.1.6, 2.11.1
apache-airflow CVE-2026-26929 🚨 HIGH 3.1.5 3.1.8
apache-airflow CVE-2026-28779 🚨 HIGH 3.1.5 3.1.8
apache-airflow CVE-2026-30911 🚨 HIGH 3.1.5 3.1.8
apache-airflow-providers-http CVE-2025-69219 🚨 HIGH 5.6.0 6.0.0
azure-core CVE-2026-21226 🚨 HIGH 1.37.0 1.38.0
cryptography CVE-2026-26007 🚨 HIGH 42.0.8 46.0.5
google-cloud-aiplatform CVE-2026-2472 🚨 HIGH 1.130.0 1.131.0
google-cloud-aiplatform CVE-2026-2473 🚨 HIGH 1.130.0 1.133.0
jaraco.context CVE-2026-23949 🚨 HIGH 5.3.0 6.1.0
jaraco.context CVE-2026-23949 🚨 HIGH 6.0.1 6.1.0
protobuf CVE-2026-0994 🚨 HIGH 4.25.8 6.33.5, 5.29.6
pyOpenSSL CVE-2026-27459 🚨 HIGH 24.1.0 26.0.0
pyasn1 CVE-2026-23490 🚨 HIGH 0.6.1 0.6.2
pyasn1 CVE-2026-30922 🚨 HIGH 0.6.1 0.6.3
python-multipart CVE-2026-24486 🚨 HIGH 0.0.20 0.0.22
ray CVE-2025-62593 🔥 CRITICAL 2.47.1 2.52.0
starlette CVE-2025-62727 🚨 HIGH 0.48.0 0.49.1
tornado CVE-2026-31958 🚨 HIGH 6.5.3 6.5.5
urllib3 CVE-2025-66418 🚨 HIGH 1.26.20 2.6.0
urllib3 CVE-2025-66471 🚨 HIGH 1.26.20 2.6.0
urllib3 CVE-2026-21441 🚨 HIGH 1.26.20 2.6.3
wheel CVE-2026-24049 🚨 HIGH 0.45.1 0.46.2
wheel CVE-2026-24049 🚨 HIGH 0.45.1 0.46.2

🛡️ TRIVY SCAN RESULT 🛡️

Target: usr/bin/docker

Vulnerabilities (4)

Package Vulnerability ID Severity Installed Version Fixed Version
stdlib CVE-2025-68121 🔥 CRITICAL v1.25.5 1.24.13, 1.25.7, 1.26.0-rc.3
stdlib CVE-2025-61726 🚨 HIGH v1.25.5 1.24.12, 1.25.6
stdlib CVE-2025-61728 🚨 HIGH v1.25.5 1.24.12, 1.25.6
stdlib CVE-2026-25679 🚨 HIGH v1.25.5 1.25.8, 1.26.1

🛡️ TRIVY SCAN RESULT 🛡️

Target: /etc/ssl/private/ssl-cert-snakeoil.key

No Vulnerabilities Found

🛡️ TRIVY SCAN RESULT 🛡️

Target: /home/airflow/openmetadata-airflow-apis/openmetadata_managed_apis.egg-info/PKG-INFO

No Vulnerabilities Found

@github-actions
Contributor

github-actions bot commented Mar 20, 2026

Jest test Coverage

UI tests summary

Lines Statements Branches Functions
64% 64.85% (58126/89627) 44.67% (30722/68764) 47.69% (9199/19289)

@github-actions
Contributor

github-actions bot commented Mar 20, 2026

🟡 Playwright Results — all passed (21 flaky)

✅ 3393 passed · ❌ 0 failed · 🟡 21 flaky · ⏭️ 217 skipped

Shard Passed Failed Flaky Skipped
🟡 Shard 1 450 0 5 2
🟡 Shard 2 601 0 2 32
🟡 Shard 3 605 0 3 28
🟡 Shard 4 597 0 6 47
✅ Shard 5 587 0 0 67
🟡 Shard 6 553 0 5 41
🟡 21 flaky test(s) (passed on retry)
  • Features/DataAssetRulesDisabled.spec.ts › should allow multiple domain selection for glossary term when entity rules are disabled (shard 1, 1 retry)
  • Features/CustomizeDetailPage.spec.ts › Stored Procedure - customization should work (shard 1, 1 retry)
  • Flow/Tour.spec.ts › Tour should work from welcome screen (shard 1, 1 retry)
  • Flow/Tour.spec.ts › Tour should work from URL directly (shard 1, 1 retry)
  • Pages/UserCreationWithPersona.spec.ts › Create user with persona and verify on profile (shard 1, 1 retry)
  • Features/BulkEditEntity.spec.ts › Glossary (shard 2, 1 retry)
  • Features/DataQuality/AddTestCaseNewFlow.spec.ts › Add Column Test Case (shard 2, 1 retry)
  • Features/Permissions/GlossaryPermissions.spec.ts › Team-based permissions work correctly (shard 3, 1 retry)
  • Flow/ExploreDiscovery.spec.ts › Should display deleted assets when showDeleted is checked and deleted is not present in queryFilter (shard 3, 1 retry)
  • Flow/SchemaTable.spec.ts › schema table test (shard 3, 1 retry)
  • Pages/Customproperties-part2.spec.ts › entityReferenceList shows item count, scrollable list, no expand toggle (shard 4, 1 retry)
  • Pages/DataContracts.spec.ts › Create Data Contract and validate for Database Schema (shard 4, 1 retry)
  • Pages/DomainDataProductsRightPanel.spec.ts › Should display overview tab for data product (shard 4, 1 retry)
  • Pages/Domains.spec.ts › Data Product announcement create, edit & delete (shard 4, 1 retry)
  • Pages/Domains.spec.ts › Comprehensive domain rename with ALL relationships preserved (shard 4, 1 retry)
  • Pages/Entity.spec.ts › Tag Add, Update and Remove (shard 4, 2 retries)
  • Pages/HyperlinkCustomProperty.spec.ts › should display URL when no display text is provided (shard 6, 1 retry)
  • Pages/ODCSImportExport.spec.ts › Multi-object ODCS contract - object selector shows all schema objects (shard 6, 1 retry)
  • Pages/Users.spec.ts › Permissions for table details page for Data Consumer (shard 6, 1 retry)
  • Pages/Users.spec.ts › Check permissions for Data Steward (shard 6, 1 retry)
  • VersionPages/EntityVersionPages.spec.ts › Directory (shard 6, 1 retry)

📦 Download artifacts

How to debug locally
# Download playwright-test-results-<shard> artifact and unzip
npx playwright show-trace path/to/trace.zip    # view trace

Copilot AI review requested due to automatic review settings March 20, 2026 09:24
Contributor

Copilot AI left a comment

Copilot was unable to review this pull request because the user who requested the review is ineligible. To be eligible to request a review, you need a paid Copilot license, or your organization must enable Copilot code review.

Contributor

Copilot AI left a comment

Pull request overview

Copilot reviewed 34 out of 56 changed files in this pull request and generated 1 comment.

Comment on lines +422 to +424
} catch (Exception e) {
  LOG.debug("Error searching for container by fullPath {}: {}", fullPath, e.getMessage());
}

Copilot AI Mar 26, 2026

This debug log drops the exception stack trace by logging only e.getMessage(), which makes diagnosing container-resolution issues harder. Log the exception itself (or include the stack trace) when debug is enabled so failures can be investigated.

Copilot uses AI. Check for mistakes.
Contributor

Copilot AI left a comment

Pull request overview

Copilot reviewed 36 out of 58 changed files in this pull request and generated 4 comments.

Comment on lines +37 to +48

OM_HOST = "http://localhost:8585"
OM_API = f"{OM_HOST}/api"
OM_JWT = (
"eyJraWQiOiJHYjM4OWEtOWY3Ni1nZGpzLWE5MmotMDI0MmJrOTQzNTYiLCJ0eXAiOiJKV1QiLCJhbGci"
"OiJSUzI1NiJ9.eyJzdWIiOiJhZG1pbiIsImlzQm90IjpmYWxzZSwiaXNzIjoib3Blbi1tZXRhZGF0YS5vcm"
"ciLCJpYXQiOjE2NjM5Mzg0NjIsImVtYWlsIjoiYWRtaW5Ab3Blbm1ldGFkYXRhLm9yZyJ9.tS8um_5DKu7"
"HgzGBzS1VTA5uUjKWOCU0B_j08WXBiEC0mr0zNREkqVfwFDD-d24HlNEbrqioLsBuFRiwIWKc1m_ZlVQbG7"
"P36RUxhuv2vbSp80FKyNM-Tj93FDzq91jsyNmsQhyNv_fNr3TXfzzSPjHt8Go0FMMP66weoKMgW2PbXlhVK"
"wEuXUHyakLLzewm9UMeQaEiRzhiTMU3UkLXcKbYEJJvfNFcLwSl9W8JCO_l0Yj3ud-qt_nQYEZwqW6u5nfd"
"QllN133iikV4fM5QZsMCnm8Rq1mvLR0y9bmJiD7fwM1tmJ791TUWqmKaTnP49U493VanKpUAfzIiOiIbhg"
)

Copilot AI Mar 26, 2026

This test duplicates a long hardcoded admin JWT. The repo already centralizes test JWTs via _openmetadata_testutils.ometa.OM_JWT; using that (or reading from an env var) avoids duplication and reduces the chance of stale credentials causing test failures.

Suggested change
- OM_HOST = "http://localhost:8585"
- OM_API = f"{OM_HOST}/api"
- OM_JWT = (
-     "eyJraWQiOiJHYjM4OWEtOWY3Ni1nZGpzLWE5MmotMDI0MmJrOTQzNTYiLCJ0eXAiOiJKV1QiLCJhbGci"
-     "OiJSUzI1NiJ9.eyJzdWIiOiJhZG1pbiIsImlzQm90IjpmYWxzZSwiaXNzIjoib3Blbi1tZXRhZGF0YS5vcm"
-     "ciLCJpYXQiOjE2NjM5Mzg0NjIsImVtYWlsIjoiYWRtaW5Ab3Blbm1ldGFkYXRhLm9yZyJ9.tS8um_5DKu7"
-     "HgzGBzS1VTA5uUjKWOCU0B_j08WXBiEC0mr0zNREkqVfwFDD-d24HlNEbrqioLsBuFRiwIWKc1m_ZlVQbG7"
-     "P36RUxhuv2vbSp80FKyNM-Tj93FDzq91jsyNmsQhyNv_fNr3TXfzzSPjHt8Go0FMMP66weoKMgW2PbXlhVK"
-     "wEuXUHyakLLzewm9UMeQaEiRzhiTMU3UkLXcKbYEJJvfNFcLwSl9W8JCO_l0Yj3ud-qt_nQYEZwqW6u5nfd"
-     "QllN133iikV4fM5QZsMCnm8Rq1mvLR0y9bmJiD7fwM1tmJ791TUWqmKaTnP49U493VanKpUAfzIiOiIbhg"
- )
+ from _openmetadata_testutils.ometa import OM_JWT
+ OM_HOST = "http://localhost:8585"
+ OM_API = f"{OM_HOST}/api"

Copilot uses AI. Check for mistakes.
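
The environment-variable alternative mentioned in this comment might look like the sketch below. The variable name `OM_JWT` and the helper `load_test_jwt` are assumptions made for illustration; the repository may use a different name or mechanism.

```python
import os
from typing import Mapping

# Hypothetical variable name; the repository may use a different one.
JWT_ENV_VAR = "OM_JWT"


def load_test_jwt(env: Mapping[str, str] = os.environ) -> str:
    """Read the test JWT from the environment and fail loudly when it is absent."""
    token = env.get(JWT_ENV_VAR, "").strip()
    if not token:
        raise RuntimeError(f"Set {JWT_ENV_VAR} to run the integration tests")
    return token


# Example with an explicit mapping instead of the real environment.
token = load_test_jwt({"OM_JWT": " abc.def.ghi "})
```

Failing at load time gives a clear message when the credential is missing, rather than an opaque 401 later in the test run.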

sonika-shah previously approved these changes Mar 26, 2026

gitar-bot bot commented Mar 26, 2026

Code Review 👍 Approved with suggestions 7 resolved / 8 findings

Airflow 3.x API connector adds support for the latest Airflow version with comprehensive test coverage and multiple bug fixes including auth error handling, JWT exchange robustness, and MWAA compatibility. Consider sanitizing the namespace parameter in the fallback FQN construction to prevent potential injection issues.

💡 Edge Case: Fallback FQN uses unsanitized namespace from OpenLineage event

📄 openmetadata-service/src/main/java/org/openmetadata/service/openlineage/OpenLineageEntityResolver.java:160-174

The new fallback at line 162 constructs namespace + "." + name directly from OpenLineage event data without sanitization, unlike the primary path which uses replaceAll("[^a-zA-Z0-9_-]", "_"). While getEntityReferenceByName will simply throw EntityNotFoundException for invalid FQNs (so this isn't exploitable), the comment example fasfas.stackoverflow_etl_lineage suggests the namespace is expected to be a clean service name. A namespace containing dots (e.g., airflow.prod.us-east) would produce an FQN with extra segments that could accidentally match a different entity.
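
To make the dotted-namespace concern concrete, here is a Python sketch that applies the character class quoted in the finding to the namespace before building the fallback name. The resolver itself is Java; `fallback_fqn` is a hypothetical helper, and whether the real fix should sanitize or quote the namespace is left open.

```python
import re

# Same character class as the replaceAll pattern quoted in the finding.
_UNSAFE = re.compile(r"[^a-zA-Z0-9_-]")


def fallback_fqn(namespace: str, name: str) -> str:
    """Build `<service>.<pipeline>` with the namespace reduced to a single segment."""
    safe_namespace = _UNSAFE.sub("_", namespace)
    return f"{safe_namespace}.{name}"


# Unsanitized, this namespace would add two extra FQN segments.
fqn = fallback_fqn("airflow.prod.us-east", "stackoverflow_etl_lineage")
# fqn == "airflow_prod_us-east.stackoverflow_etl_lineage"
```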

✅ 7 resolved
Edge Case: Auto-detection silently masks auth errors as version fallback

📄 ingestion/src/metadata/ingestion/source/pipeline/airflowapi/client.py:82 📄 ingestion/src/metadata/ingestion/source/pipeline/airflowapi/client.py:45
_detect_api_version() catches all exceptions including HTTP 401/403 (invalid credentials). If a user misconfigures their token or username/password, the method silently falls back to v1 instead of surfacing the authentication failure. This makes credential problems very hard to diagnose — the user sees mysterious downstream failures rather than a clear auth error.

The underlying REST client raises requests.exceptions.HTTPError with response.status_code available, so auth errors can be distinguished from genuine 404s.

Note: this follows the broad except Exception pattern used by other pipeline connectors (including the sibling airflow connector), but it's worth fixing here since the fallback behavior actively masks the root cause.
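
A sketch of the distinction this finding asks for, with 401/403 surfaced and only 404 treated as "fall back to v1". It is not the merged implementation: `HttpError` is a hypothetical stand-in for `requests.exceptions.HTTPError`, and `probe_v2` stands in for whatever request the client uses to test the newer API.

```python
from typing import Callable


class HttpError(Exception):
    """Hypothetical stand-in for requests.exceptions.HTTPError."""

    def __init__(self, status_code: int) -> None:
        super().__init__(f"HTTP {status_code}")
        self.status_code = status_code


class AuthenticationError(Exception):
    """Raised so that bad credentials are reported instead of hidden."""


def detect_api_version(probe_v2: Callable[[], None]) -> str:
    """Return "v2" if the probe succeeds, "v1" on 404, and raise on auth failures."""
    try:
        probe_v2()
        return "v2"
    except HttpError as err:
        if err.status_code in (401, 403):
            raise AuthenticationError(f"Airflow rejected the credentials: {err}") from err
        if err.status_code == 404:
            return "v1"
        raise  # any other HTTP status is unexpected; do not guess a version


def failing_probe(status_code: int) -> Callable[[], None]:
    """Build a probe that always fails with the given status (example only)."""

    def probe() -> None:
        raise HttpError(status_code)

    return probe
```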

Quality: Typo in Java class name: AirflowRestAPiConnection (capital P)

📄 openmetadata-spec/src/main/resources/json/schema/entity/utils/airflowRestApiConnection.json:7 📄 openmetadata-service/src/main/java/org/openmetadata/service/secrets/converter/AirflowConnectionClassConverter.java:22 📄 openmetadata-service/src/main/java/org/openmetadata/service/secrets/converter/AirflowConnectionClassConverter.java:35
The JSON schema sets javaType to AirflowRestAPiConnection (with uppercase 'P' in 'APi') instead of the conventional AirflowRestApiConnection. This inconsistency propagates to all generated Java code and the converter class. While it works, it makes the class harder to find and looks like a typo.

Edge Case: _try_exchange_jwt swallows connection errors, falls back silently

📄 ingestion/src/metadata/ingestion/source/pipeline/airflowapi/client.py:40 📄 ingestion/src/metadata/ingestion/source/pipeline/airflowapi/client.py:116 📄 ingestion/src/metadata/ingestion/source/pipeline/airflow/connection.py:232-236
_try_exchange_jwt catches all exceptions with a bare except Exception (line 51), including ConnectionError, TimeoutError, and DNS failures. If the Airflow host is unreachable or temporarily down during the JWT exchange attempt, the function silently returns None, causing the constructor to fall back to Basic auth. Subsequent API calls to the same unreachable host will then fail with a confusing connection error rather than failing fast during initialization.

This is inconsistent with _detect_api_version (lines 116-121), which correctly re-raises ConnectionError, TimeoutError, and OSError so connectivity problems surface immediately.

The fix is to let network-level exceptions propagate, only catching HTTP errors (like 404 for Airflow 2.x that doesn't have /auth/token).
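
The fix described here can be sketched as follows, assuming the token request raises an HTTP error carrying a status code. `HttpStatusError` and `request_token` are hypothetical names used only for this example.

```python
from typing import Callable, Optional


class HttpStatusError(Exception):
    """Hypothetical stand-in for an HTTP error that carries a status code."""

    def __init__(self, status_code: int) -> None:
        super().__init__(f"HTTP {status_code}")
        self.status_code = status_code


def try_exchange_jwt(request_token: Callable[[], str]) -> Optional[str]:
    """Return a JWT, or None when the token endpoint does not exist (404)."""
    try:
        return request_token()
    except HttpStatusError as err:
        if err.status_code == 404:
            # Endpoint absent (e.g. an older Airflow): caller may fall back to Basic auth.
            return None
        raise
    # ConnectionError, TimeoutError and other OSError subclasses are deliberately
    # not caught, so an unreachable host fails fast during initialization.


def unreachable() -> str:
    """Simulate a host that cannot be reached (example only)."""
    raise ConnectionError("host down")


def missing_endpoint() -> str:
    """Simulate a server without the token endpoint (example only)."""
    raise HttpStatusError(404)
```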

Bug: Double _parse_response call on already-parsed responses

📄 ingestion/src/metadata/ingestion/source/pipeline/airflow/api/client.py:236-237 📄 ingestion/src/metadata/ingestion/source/pipeline/airflow/api/client.py:271-272
get_dag_tasks() (line 167-170) already calls self._parse_response(response) and returns a dict. Then in get_dag_detail() at line 237, the result is parsed again via self._parse_response(task_response). The same pattern occurs with list_dag_runs() (line 173-177) which already parses, but get_dag_runs() at line 272 parses again.

This is currently harmless because _parse_response falls through to return response for dicts (no .json attr), but it's misleading and fragile — if _parse_response behavior ever changes (e.g., wrapping results), this would break silently.

Quality: Test helper docstring claims AccessToken auth but uses no auth

📄 ingestion/tests/unit/topology/pipeline/test_airflowapi.py:38-44
In _make_client(), auth_config is a MagicMock, which will fail all isinstance checks in AirflowApiClient.__init__ (lines 61-73 of client.py), causing auth_token_fn = None (the else branch). The docstring says 'using AccessToken auth' but the client is actually created with no authentication. This doesn't break tests because TrackedREST is fully mocked, but the docstring is misleading. If the intent is to exercise the AccessToken path in the constructor, the mock should be a real AccessToken instance (as done correctly in test_airflow_connection.py).
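
The isinstance behaviour behind this finding is easy to reproduce in isolation. In the sketch below, `AccessToken` and `pick_auth` are hypothetical stand-ins for the connector's config class and constructor dispatch. A plain `MagicMock` fails the check, while `MagicMock(spec=AccessToken)` passes it because a spec'd mock reports the spec class as its `__class__`.

```python
from unittest.mock import MagicMock


class AccessToken:
    """Hypothetical stand-in for the connector's AccessToken config class."""

    token: str = ""


def pick_auth(auth_config: object) -> str:
    """Mimic an isinstance-based dispatch like the one the finding describes."""
    if isinstance(auth_config, AccessToken):
        return "token"
    return "none"


plain_mock = MagicMock()
specced_mock = MagicMock(spec=AccessToken)

plain_result = pick_auth(plain_mock)      # "none": falls into the else branch
specced_result = pick_auth(specced_mock)  # "token": passes the isinstance check
```

Using a real instance, as the finding recommends, remains the most direct option; a spec'd mock is an alternative when constructing the real object is inconvenient.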

...and 2 more resolved from earlier reviews


@keshavmohta09
Member

Changes have been cherry-picked to the 1.12.4 branch 57280af

Labels

backend, safe to test

Projects

None yet

Development

Successfully merging this pull request may close these issues.

Support Airflow 3 Rest Connection Ingestion

10 participants