Disable failing ci builders by kiburtse · Pull Request #7206 · realm/realm-core · GitHub

Disable failing ci builders #7206

Merged (10 commits) on Dec 14, 2023
124 changes: 79 additions & 45 deletions evergreen/config.yml
@@ -1215,6 +1215,22 @@ task_groups:
- .test_suite
- .object_store_test_suite

# compile and run only object store tests: local and remote
- name: compile_test_object_store
max_hosts: 1
setup_group_can_fail_task: true
setup_group:
- func: "fetch source"
- func: "fetch binaries"
teardown_task:
- func: "upload test results"
- func: "upload baas artifacts"
timeout:
- func: "run hang analyzer"
tasks:
- compile
- .object_store_test_suite

# Runs object-store-tests against baas running on remote host and runs
# the network simulation tests as a separate task for nightly builds
- name: network_tests
@@ -1465,50 +1481,51 @@ buildvariants:
tasks:
- name: fuzzer-tests

- name: ubuntu2004-network-nonideal
display_name: "Ubuntu 20.04 x86_64 (Utunbu2004 - nonideal transfer)"
run_on: ubuntu2004-large
expansions:
clang_url: "https://s3.amazonaws.com/static.realm.io/evergreen-assets/clang%2Bllvm-11.0.0-x86_64-linux-gnu-ubuntu-20.04.tar.xz"
cmake_url: "https://s3.amazonaws.com/static.realm.io/evergreen-assets/cmake-3.20.3-linux-x86_64.tar.gz"
cmake_bindir: "./cmake_binaries/bin"
fetch_missing_dependencies: On
c_compiler: "./clang_binaries/bin/clang"
cxx_compiler: "./clang_binaries/bin/clang++"
cmake_build_type: RelWithDebInfo
run_with_encryption: On
baas_admin_port: 9098
test_logging_level: debug
test_timeout_extra: 60
proxy_toxics_file: evergreen/proxy-nonideal-transfer.toxics
# RANDOM1: bandwidth-upstream limited to between 10-50 KB/s from the client to the server
# RANDOM2: bandwidth-downstream limited to between 10-50 KB/s from the server to the client
proxy_toxics_randoms: "10:50|10:50"
tasks:
- name: network_tests

- name: ubuntu2004-network-faulty
display_name: "Ubuntu 20.04 x86_64 (Utunbu2004 - network faults)"
run_on: ubuntu2004-large
expansions:
clang_url: "https://s3.amazonaws.com/static.realm.io/evergreen-assets/clang%2Bllvm-11.0.0-x86_64-linux-gnu-ubuntu-20.04.tar.xz"
cmake_url: "https://s3.amazonaws.com/static.realm.io/evergreen-assets/cmake-3.20.3-linux-x86_64.tar.gz"
cmake_bindir: "./cmake_binaries/bin"
fetch_missing_dependencies: On
c_compiler: "./clang_binaries/bin/clang"
cxx_compiler: "./clang_binaries/bin/clang++"
cmake_build_type: RelWithDebInfo
run_with_encryption: On
baas_admin_port: 9098
test_logging_level: debug
proxy_toxics_file: evergreen/proxy-network-faults.toxics
# RANDOM1: limit-data-upstream to close connection after between 1000-3000 bytes have been sent
# RANDOM2: limit-data-downstream to close connection after between 1000-3000 bytes have been received
# RANDOM3: slow-close-upstream to keep connection to server open after 1000-1500 milliseconds after being closed
# RANDOM4: reset-peer-upstream after 50-200 seconds to force close the connection to the server
proxy_toxics_randoms: "1000:3000|1000:3000|1000:1500|50:200"
tasks:
- name: network_tests
# disable these builders since there are constantly failing and not yet ready for nightly builds
# - name: ubuntu2004-network-nonideal
# display_name: "Ubuntu 20.04 x86_64 (Utunbu2004 - nonideal transfer)"
# run_on: ubuntu2004-large
# expansions:
# clang_url: "https://s3.amazonaws.com/static.realm.io/evergreen-assets/clang%2Bllvm-11.0.0-x86_64-linux-gnu-ubuntu-20.04.tar.xz"
# cmake_url: "https://s3.amazonaws.com/static.realm.io/evergreen-assets/cmake-3.20.3-linux-x86_64.tar.gz"
# cmake_bindir: "./cmake_binaries/bin"
# fetch_missing_dependencies: On
# c_compiler: "./clang_binaries/bin/clang"
# cxx_compiler: "./clang_binaries/bin/clang++"
# cmake_build_type: RelWithDebInfo
# run_with_encryption: On
# baas_admin_port: 9098
# test_logging_level: debug
# test_timeout_extra: 60
# proxy_toxics_file: evergreen/proxy-nonideal-transfer.toxics
# # RANDOM1: bandwidth-upstream limited to between 10-50 KB/s from the client to the server
# # RANDOM2: bandwidth-downstream limited to between 10-50 KB/s from the server to the client
# proxy_toxics_randoms: "10:50|10:50"
# tasks:
# - name: network_tests
#
# - name: ubuntu2004-network-faulty
# display_name: "Ubuntu 20.04 x86_64 (Utunbu2004 - network faults)"
# run_on: ubuntu2004-large
# expansions:
# clang_url: "https://s3.amazonaws.com/static.realm.io/evergreen-assets/clang%2Bllvm-11.0.0-x86_64-linux-gnu-ubuntu-20.04.tar.xz"
# cmake_url: "https://s3.amazonaws.com/static.realm.io/evergreen-assets/cmake-3.20.3-linux-x86_64.tar.gz"
# cmake_bindir: "./cmake_binaries/bin"
# fetch_missing_dependencies: On
# c_compiler: "./clang_binaries/bin/clang"
# cxx_compiler: "./clang_binaries/bin/clang++"
# cmake_build_type: RelWithDebInfo
# run_with_encryption: On
# baas_admin_port: 9098
# test_logging_level: debug
# proxy_toxics_file: evergreen/proxy-network-faults.toxics
# # RANDOM1: limit-data-upstream to close connection after between 1000-3000 bytes have been sent
# # RANDOM2: limit-data-downstream to close connection after between 1000-3000 bytes have been received
# # RANDOM3: slow-close-upstream to keep connection to server open after 1000-1500 milliseconds after being closed
# # RANDOM4: reset-peer-upstream after 50-200 seconds to force close the connection to the server
# proxy_toxics_randoms: "1000:3000|1000:3000|1000:1500|50:200"
# tasks:
# - name: network_tests

- name: rhel70
display_name: "RHEL 7 x86_64"
@@ -1546,10 +1563,25 @@ buildvariants:
python3: "/opt/mongodbtoolchain/v3/bin/python3"
use_system_openssl: On
fetch_missing_dependencies: On
cmake_build_type: RelWithDebInfo
tasks:
- name: compile_test_and_package
- name: benchmarks

- name: ubuntu2204-arm64-asan
display_name: "Ubuntu 22.04 ARM64 (ASAN)"
run_on: ubuntu2204-arm64-large
expansions:
cmake_url: "https://s3.amazonaws.com/static.realm.io/evergreen-assets/cmake-3.20.3-linux-aarch64.tar.gz"
cmake_bindir: "./cmake_binaries/bin"
python3: "/opt/mongodbtoolchain/v3/bin/python3"
use_system_openssl: On
fetch_missing_dependencies: On
cmake_build_type: Debug
enable_asan: On
tasks:
- name: compile_test

Comment on lines +1571 to +1584
Contributor:
So you got ubuntu2204-arm64-asan to pass by suppressing the messages around adjtime - nice!

Contributor Author:
Not quite: this builder is ASAN, not TSAN. The adjtime problem is a new one that appeared after #6911, and it also fails builds from time to time. TSAN on Ubuntu arm64 still needs a lot of fixes. The test build on ubuntu2204-arm64-large showed a few hundred issues; I haven't looked into them yet, apart from the linked open ones.

- name: macos
display_name: "MacOS 11.0 x86_64"
run_on: macos-1100
@@ -1695,7 +1727,9 @@ buildvariants:
cmake_build_type: RelWithDebInfo
enable_tsan: On
tasks:
- name: compile_test
# FIXME: tsan is not stable on arm64, fails often with internal errors
# - name: compile_test
Comment on lines +1730 to +1731
Contributor:
Was this for macOS as well? I thought it was only for ubuntu2204...

Contributor Author (@kiburtse, Dec 14, 2023):
This is for #7185. The failure appears to be specific to core_tests and sync_tests, and only on macOS 11 arm64. I haven't seen these failures myself on macOS 13 arm64, so I hope that if we move this builder to macos1300-arm64 we can re-enable the whole test suite.

Ubuntu on arm64 reports different issues, but consistently and for every test suite we have, from what I can tell.

- name: compile_test_object_store
Contributor:
Do you know if the sync-tests pass on this platform?

Contributor Author:
They do, but they fail far too often; see #7185.


- name: macos-coverage
display_name: "MacOS 11 arm64 (Code Coverage)"
5 changes: 4 additions & 1 deletion test/tsan.suppress
@@ -11,8 +11,11 @@ race:realm::util::EncryptedFileMapping::copy_up_to_date_page

# Avoid a false positive instance of lock-order-inversion.
# SyncManager::m_sessions_mutex and SyncSession::m_state_mutex are locked
# in this order when a SyncSession is created, and in reverse order when
# in this order when a SyncSession is created, and in reverse order when
# SyncSession::become_inactive is called. Creating a SyncSession and becoming
# inactive cannot happen at the same time.
deadlock:realm::sync::MigrationStore::create_sentinel_subscription_set
deadlock:realm::sync::MigrationStore::create_subscriptions

# mktime, timegm, gmtime modify global time zone env var, but the race is harmless
race:adjtime
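
For readers unfamiliar with the format: each non-comment line in a ThreadSanitizer suppressions file has the form `type:pattern`, where the type (`race` and `deadlock` in this file) names the report category and the pattern is matched against symbols in the report. The sketch below is a hypothetical Python helper, not part of realm-core, that lists the entries in such a file. It only illustrates the line structure and does not reproduce TSan's own matching rules.

```python
def parse_suppressions(text: str) -> list[tuple[str, str]]:
    """Return (type, pattern) pairs, skipping blank lines and '#' comments."""
    entries = []
    for raw_line in text.splitlines():
        line = raw_line.strip()
        if not line or line.startswith("#"):
            continue
        # Split on the first ':' only, since C++ symbols contain '::'.
        kind, _, pattern = line.partition(":")
        entries.append((kind, pattern))
    return entries


# Sample lines copied from the diff above.
sample = """\
deadlock:realm::sync::MigrationStore::create_subscriptions

# mktime, timegm, gmtime modify global time zone env var, but the race is harmless
race:adjtime
"""

for kind, pattern in parse_suppressions(sample):
    print(f"{kind:<8} {pattern}")
```

ThreadSanitizer reads a file like this when the `TSAN_OPTIONS` environment variable includes `suppressions=<path>`; how the realm-core CI sets that option is not shown in this diff.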