[11:56:22] hashar: hey :)
[11:56:37] so https://integration.wikimedia.org/ci/job/debian-glue/620/ failed and https://integration.wikimedia.org/ci/job/debian-glue/621/ passed
[11:56:41] any idea why?
[12:01:28] I see that the failing build happened on integration-slave-jessie-1001, the successful one on integration-slave-jessie-1002
[12:06:21] oh and also the failing build called cowbuilder --dumpconfig, the good one went straight to --build
[12:23:28] eeeeek
[12:23:36] ema: will check that after lunch
[12:23:44] been talking to volans about debian glue jobs
[12:23:53] guess there is a glitch somewhere
[12:24:03] maybe even a live hack on one of the jenkins instances (which is terrible)
[12:45:43] ema: different cowbuilder versions :(
[12:57:04] downgraded cowbuilder
[12:57:12] rebuilt the failing patch on the same instance
[12:57:13] and it passes
[13:52:05] ok
[14:50:42] moritzm: openssl 1.1.0e upgraded on cp2002 and cp4008, unless something goes wrong we can carry on with the other cache hosts too
[14:50:57] ok!
[17:19:32] so we had a bunch of 500s earlier on https://grafana.wikimedia.org/dashboard/db/varnish-aggregate-client-status-codes?panelId=2&fullscreen&var-site=ulsfo&var-cache_type=text&var-status_type=5&from=1487604564888&to=1487607424828
[17:20:17] grepping around 5xx.json, they all happened on cp4008, which is one of the two machines I upgraded to the latest openssl earlier today
[17:20:36] however, the requests also all come from the same IP, so it might just be a coincidence
[17:23:04] uri_path is the interesting part of all those errors: "/https://www.wikipedia.org/https://www.wikipedia.org/https://www.wikipedia.org [...]"
[17:23:36] and so on, with len(uri_path) == 2048
[17:25:24] for the record, the first one came in at 2017-02-20T15:45:50 and the last one at 2017-02-20T15:59:42
[17:29:15] tomorrow I'll upgrade another cache_text machine to the latest openssl and see if this was indeed just a coincidence
[19:06:41] 10Traffic, 06Operations, 07Mobile: Samsung Internet's desktop mode getting redirected to mobile site - https://phabricator.wikimedia.org/T158599#3041524 (10MaxSem)
[19:38:48] 10Traffic, 06Operations, 07Mobile: Samsung Internet's desktop mode getting redirected to mobile site - https://phabricator.wikimedia.org/T158599#3041524 (10revi) Interesting, my Samsung Galaxy A7 (2016)'s bundled Samsung Internet correctly handles 'request desktop version'. Maybe it's for Google Play version...
[20:36:56] 10netops, 10Continuous-Integration-Infrastructure, 06Operations: jsduck publish error: index-pack died of signal 15 - https://phabricator.wikimedia.org/T158601#3041635 (10hashar) The jenkins jobs triggered by Zuul clone the repo from the zuul-merger instances on contint1001 / contint2001. They are being ser...
[20:37:38] 10netops, 10Continuous-Integration-Infrastructure, 06Operations: git clone over EQIAD (wmflabs) CODFW timeout due to low bandwidth (~250 KiB/s) - https://phabricator.wikimedia.org/T158601#3041638 (10hashar)
[20:40:18] 10netops, 10Continuous-Integration-Infrastructure, 06Labs, 06Operations: git clone over EQIAD (wmflabs) CODFW timeout due to low bandwidth (~250 KiB/s) - https://phabricator.wikimedia.org/T158601#3041644 (10Paladox)
[20:41:09] 10netops, 10Continuous-Integration-Infrastructure, 06Labs, 06Operations: git clone over EQIAD (wmflabs) CODFW timeout due to low bandwidth (~250 KiB/s) - https://phabricator.wikimedia.org/T158601#3041581 (10Paladox) Is this happening to any other repos? Should we set this as normal or high priority?
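
To illustrate the version-mismatch check hashar describes at 12:45 (different cowbuilder versions on the two Jenkins slaves), here is a minimal sketch. The host names come from the log; the use of ssh and dpkg-query is an assumption, not the actual command hashar ran.

```python
#!/usr/bin/env python3
# Sketch: compare the installed cowbuilder package on the two Jenkins slaves
# named in the log. Assumes passwordless ssh access to both hosts.
import subprocess

HOSTS = ["integration-slave-jessie-1001", "integration-slave-jessie-1002"]

def cowbuilder_version(host):
    # dpkg-query -W prints "package<TAB>version" for an installed package.
    out = subprocess.check_output(
        ["ssh", host, "dpkg-query", "-W", "cowbuilder"], text=True
    )
    return out.split()[1]

versions = {host: cowbuilder_version(host) for host in HOSTS}
for host, version in versions.items():
    print(f"{host}: cowbuilder {version}")
if len(set(versions.values())) > 1:
    print("cowbuilder versions differ between slaves -- likely the culprit")
```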
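
For the 14:50 openssl canary upgrade, a post-upgrade sanity check might look like the following sketch before rolling on to the other cache hosts. The host names and target version come from the log; ssh access and `openssl version` output format are assumptions.

```python
#!/usr/bin/env python3
# Sketch: confirm the canary cache hosts report the expected openssl version.
import subprocess

CANARIES = ["cp2002", "cp4008"]
EXPECTED = "1.1.0e"

for host in CANARIES:
    out = subprocess.check_output(["ssh", host, "openssl", "version"], text=True)
    # "openssl version" prints e.g. "OpenSSL 1.1.0e  16 Feb 2017".
    version = out.split()[1]
    status = "ok" if version.startswith(EXPECTED) else "MISMATCH"
    print(f"{host}: {out.strip()} [{status}]")
```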
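
The 17:20 investigation (grepping 5xx.json for the 500s, all on cp4008, all from one IP, all with the repeated "/https://www.wikipedia.org" uri_path of length 2048) could be reproduced with a sketch like this one. It assumes 5xx.json is newline-delimited JSON; the field names (http_status, hostname, client_ip, uri_path, dt) are guesses at the schema, not confirmed from the source.

```python
#!/usr/bin/env python3
# Sketch: pull the 500s out of 5xx.json and check whether they all share one
# server and one client IP, flagging the suspicious repeated-URL requests.
import json
from collections import Counter

servers, clients = Counter(), Counter()
with open("5xx.json") as f:
    for line in f:
        rec = json.loads(line)
        if str(rec.get("http_status")) != "500":
            continue
        servers[rec.get("hostname")] += 1
        clients[rec.get("client_ip")] += 1
        # The requests in the log had a uri_path of repeated
        # "/https://www.wikipedia.org" segments, truncated at 2048 bytes.
        path = rec.get("uri_path", "")
        if len(path) == 2048 and path.startswith("/https://www.wikipedia.org"):
            print("suspicious request on", rec.get("hostname"), rec.get("dt"))

print("by server:", servers.most_common(3))
print("by client:", clients.most_common(3))
```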