[00:21:09] 10Tool-Labs-tools-Xtools, 10Addwiki, 03Community-Tech-Sprint: MediawikiApi::newFromPage() fails on non-XML HTML - https://phabricator.wikimedia.org/T163527#3219833 (10MusikAnimal) Maybe I'm doing something wrong, but on the latest mediawiki-api and mediawiki-api-core, with `MediawikiApi::newFromPage('https:/...
[00:42:51] PROBLEM - Puppet errors on tools-exec-1431 is CRITICAL: CRITICAL: 40.00% of data above the critical threshold [0.0]
[01:04:56] 06Labs, 10DBA, 10MediaWiki-General-or-Unknown: MW database: user.user_editcount shows a wrong value - https://phabricator.wikimedia.org/T134359#3219865 (10Krinkle)
[01:17:52] RECOVERY - Puppet errors on tools-exec-1431 is OK: OK: Less than 1.00% above the threshold [0.0]
[02:05:20] 10Tool-Labs-tools-Xtools, 10Addwiki, 03Community-Tech-Sprint: MediawikiApi::newFromPage() fails on non-XML HTML - https://phabricator.wikimedia.org/T163527#3219894 (10Samwilson) 05Resolved>03Open Duplicate element IDs break DOMDocument::loadHTML (in the case above, https://sq.wikinews.org/wiki/MediaWiki:...
[02:46:32] 10Tool-Labs-tools-Xtools, 10Addwiki, 03Community-Tech-Sprint: MediawikiApi::newFromPage() fails on non-XML HTML - https://phabricator.wikimedia.org/T163527#3219903 (10Samwilson) PR is here: https://github.com/addwiki/mediawiki-api-base/pull/35
[08:55:27] 06Labs, 10Tool-Labs: Upgrade bootstrap-vz version for tools docker builder - https://phabricator.wikimedia.org/T157526#3220160 (10MoritzMuehlenhoff) a:03MoritzMuehlenhoff
[08:59:00] 06Labs, 10Tool-Labs: Upgrade bootstrap-vz version for tools docker builder - https://phabricator.wikimedia.org/T157526#3220167 (10MoritzMuehlenhoff) That's caused by a change in "docker import --change" no longer supporting LABEL as a Dockerfile instruction: https://docs.docker.com/engine/reference/commandline...
[10:39:54] 06Labs, 10DBA: Prepare and check storage layer for wbwikimedia - https://phabricator.wikimedia.org/T162513#3220405 (10jcrespo) @chasemp @Andrew This has been redacted on sanitarium (you have to be blocked by me first doing that before touching new wikis), it is stil pending on sanitarium2. Hold for now.
[10:49:05] guys, you are aware than replication stopped (or slowed down a lot) on s3 32 hours ago and s6 12 hours ago?
[11:12:11] it didn't stopped
[11:12:14] *stop
[11:12:20] s3 is under maintenance
[11:12:25] and so are many other shards
[11:12:37] they are applying important schema changes
[11:14:59] I probably missed the notice about the maintenance
[11:18:27] you know you can use for s3 the new beta servers, right?
[11:24:56] actually, it doesn't matter, they are less lagged, but lagged anyway
[11:56:30] Trying to run a database query on the enwiki replica db, but I keep losing connection to MySQL server during query (even when submitting with jsub).
[11:59:30] Query is: select page_title, MIN(rev_timestamp) as rev_timestamp, page_namespace from page, revision where rev_page = page_id and (page_namespace=0 or page_namespace=1) group by rev_page
[12:00:06] ;
[12:09:10] Trying to find the creation time of every page?
[12:19:04] yeryry: Yes, of every page in the article and talk namespaces.
[12:21:28] yeryry: Tried creating a custom my.cnf file with max_allowed_packet set to 4096M, but that's over 2147483647K so fails.
[12:22:04] https://www.mediawiki.org/wiki/Manual:Revision_table#rev_parent_id it looks like that is 0 for new page creations, so you may be able to use that rather than min?
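The rev_parent_id suggestion above can be sketched on toy data. This is an illustrative SQLite mock of the page/revision join (the real replica schema, indexes, and data volumes differ), showing that selecting the revision with rev_parent_id = 0 yields the same creation timestamps as the MIN()/GROUP BY form, without the aggregation:

```python
import sqlite3

conn = sqlite3.connect(":memory:")
cur = conn.cursor()
cur.executescript("""
CREATE TABLE page (page_id INTEGER PRIMARY KEY, page_title TEXT, page_namespace INTEGER);
CREATE TABLE revision (rev_id INTEGER PRIMARY KEY, rev_page INTEGER,
                       rev_timestamp TEXT, rev_parent_id INTEGER);
INSERT INTO page VALUES (1, 'Foo', 0), (2, 'Talk:Foo', 1), (3, 'Template:Bar', 10);
-- page 1 has a creating revision plus a later edit; the others only a creation
INSERT INTO revision VALUES (10, 1, '20170101000000', 0),
                            (11, 1, '20170102000000', 10),
                            (12, 2, '20170103000000', 0),
                            (13, 3, '20170104000000', 0);
""")

# Original (slow) form: MIN() + GROUP BY, which forces "Using temporary; Using filesort".
slow = cur.execute("""
    SELECT page_title, MIN(rev_timestamp), page_namespace
    FROM page, revision
    WHERE rev_page = page_id AND page_namespace IN (0, 1)
    GROUP BY rev_page
""").fetchall()

# Suggested form: the creating revision has rev_parent_id = 0, so no aggregation is needed.
fast = cur.execute("""
    SELECT page_title, rev_timestamp, page_namespace
    FROM page, revision
    WHERE rev_page = page_id AND rev_parent_id = 0 AND page_namespace IN (0, 1)
""").fetchall()

print(sorted(slow) == sorted(fast))  # → True on this toy data
```

As noted later in the discussion, the two forms are not identical on real data: imported edits can produce pages with extra (or no) rev_parent_id = 0 rows, which is why the false positives get filtered out afterwards.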
[12:23:53] Yeryry: rev_id is unreliable, see https://en.wikipedia.org/wiki/User:Graham87/Page_history_observations#Revision_ID_numbers
[12:24:31] rev_parent_id, not rev_id
[12:25:30] And I guess if there somehow is more than one with 0 for one article, you could filter those out from the output afterwards
[12:28:40] 10Wikibugs: Wikibugs bot for Phabricator currently down? - https://phabricator.wikimedia.org/T163754#3220574 (10Zppix) 05Open>03Resolved a:03Zppix Closing as it appears its "fixed"
[12:28:55] 10Wikibugs: Wikibugs bot for Phabricator currently down? - https://phabricator.wikimedia.org/T163754#3220577 (10Zppix) a:05Zppix>03None
[12:29:15] yeryry: Yes… but that doesn't catch imported edits.
[12:32:48] Ah...
[12:34:19] Well, your query seems to run quickly when I give it a limit, so presumably it's just the number of results that is a problem... Not sure if there's some built in batching system you could use... What are you connecting from/
[12:35:00] yeryry SSH to login.tools.wmflabs.org
[12:35:15] So the commandline client?
[12:36:04] yeryry Yes, running sql enwiki from the shell after logging into wmflabs over ssh.
[12:36:48] I think that only displays the results once the query is complete... Not sure if there's a way to avoid that, but obviously that's not going to be useful for so many results
[12:37:21] what's your query?
[12:37:42] Creation time for every article/article talkpage...
[12:38:02] yeryry: Well, I'm using jsub in a wrapper script which outputs to a file thanks to jsub...
[12:39:22] That just runs the client as a job, I think? Which won't make much difference...
[12:42:31] Any ideas?
[12:46:50] id select_type table type possible_keys key key_len ref rows Extra
[12:46:52] 1 SIMPLE page index PRIMARY, name_title name_title 261 None 34811465 Using where; Using index; Using temporary; Using filesort
[12:46:52] 1 SIMPLE revision ref PRIMARY, page_timestamp PRIMARY 4 enwiki.page.page_id 14
[12:50:15] zhuyifei1999_: Huh?
[12:50:34] i.e. super inefficient
[12:51:03] mainly "Using temporary; Using filesort"
[12:51:14] 06Labs, 10Tool-Labs, 06Operations: Upgrade bootstrap-vz version for tools docker builder - https://phabricator.wikimedia.org/T157526#3220620 (10MoritzMuehlenhoff)
[12:51:33] filesort is from MIN call I believe
[12:52:02] temporary is from the group by
[12:52:39] I'll suggest using rev_parent_id as well
[12:53:58] 06Labs, 10Tool-Labs, 06Operations, 13Patch-For-Review: Upgrade bootstrap-vz version for tools docker builder - https://phabricator.wikimedia.org/T157526#3220625 (10MoritzMuehlenhoff) That's actually already fixed by the bootstrap-vz version in stretch, I've backported it for jessie-wikimedia along with bac...
[13:02:09] zhuyifei1999_: Alright, will do and filter out false positives by hand (there should be no more than a few hundred).
[13:08:55] select page_title, rev_timestamp, page_namespace from page, revision where rev_page = page_id and rev_parent_id = 0 and page_namespace in (0, 1);
[13:09:08] id select_type table type possible_keys key key_len ref rows Extra
[13:09:09] 1 SIMPLE page index PRIMARY, name_title name_title 261 None 34811559 Using where; Using index
[13:09:09] 1 SIMPLE revision ref PRIMARY, page_timestamp PRIMARY 4 enwiki.page.page_id 14 Using where
[13:10:26] will scan 34811557 rows "using where" so not super efficient either
[13:14:03] How do you get those details?
[13:14:34] https://tools.wmflabs.org/tools-info/optimizer.py
[13:14:42] ^ is the easy way
[13:15:13] or you could do https://phabricator.wikimedia.org/T50875#2845764
[13:15:21] thanks zhuyifei1999_ for the tips
[13:15:52] we need more people helping like you :-)
[13:16:29] Ah, last time I tried that page it was dead.. Or maybe an old link that needed to be updated
[13:16:42] yeah, it broke for some time
[13:19:00] jynus: thanks :)
[13:48:19] PROBLEM - Puppet errors on tools-docker-builder-05 is CRITICAL: CRITICAL: 33.33% of data above the critical threshold [0.0]
[15:32:16] 10Tool-Labs-tools-Other, 15User-bd808: Add domain info to openstack-browser - https://phabricator.wikimedia.org/T164085#3221194 (10Andrew)
[16:26:11] 10Tool-Labs-tools-Xtools, 06Community-Tech: Resolve timeout errors for Adminscore - https://phabricator.wikimedia.org/T164094#3221435 (10Matthewrbowker)
[16:36:21] 10Labs-project-Wikistats: wikistats: add new wikipedias: kbp, khw, dty and pt.wikimedia - https://phabricator.wikimedia.org/T160947#3115930 (10Dereckson) You can do pt.wikimedia too, but the other two are stalled.
[16:54:14] 10Tool-Labs-tools-Other, 13Patch-For-Review, 15User-bd808: Add domain info to openstack-browser - https://phabricator.wikimedia.org/T164085#3221539 (10Andrew)
[17:24:43] Hello,Amitie_10g here
[17:28:27] I want to enable Memcached class in PHP for my tool (the grid and kubernetes), but the information found at Wikitech and Phabricator is specific only to Mediawiki installations. Before requesting anything to Phabricator, does anyone attemped to use Memcached for your own scripts?
[17:32:17] Amitie_10g: we do not have memcached in tool labs, but we do have a shared redis server
[17:32:49] Amitie_10g: https://wikitech.wikimedia.org/wiki/Help:Tool_Labs#Redis
[17:33:18] Thanks
[18:03:04] 10Tool-Labs-tools-Xtools: Convert all xtools issues to either Phabricator or GitHub - https://phabricator.wikimedia.org/T134632#3221708 (10Matthewrbowker)
[18:03:08] 10Tool-Labs-tools-Xtools, 06Community-Tech: Epic: Rewriting XTools - https://phabricator.wikimedia.org/T153112#3221709 (10Matthewrbowker)
[18:03:17] RECOVERY - Puppet errors on tools-docker-builder-05 is OK: OK: Less than 1.00% above the threshold [0.0]
[18:03:51] 10Tool-Labs-tools-Xtools: Convert all xtools issues to either Phabricator or GitHub - https://phabricator.wikimedia.org/T134632#2271790 (10Matthewrbowker) @Samwilson @MusikAnimal I'd like to unstall this task if possible. Can we possibly start working through the issues on GitHub and begin handling them? Any t...
[18:19:42] Hello, trying to run an SQL query, but getting an error stating the query was interrupted. Was on here earlier this morning my time and we optimized the query (I was losing connection to the database server).
[18:20:00] I'm using jsub on Wikimedia Labs, writing a wrapper script to run my query.
[18:20:18] select page_title, rev_timestamp, page_namespace from page, revision where rev_parent_id=0 and rev_page = page_id and (page_namespace=0 or page_namespace=1);
[18:31:21] 06Labs, 10MediaWiki-extensions-OpenStackManager, 10Tool-Labs: The future of service groups and service users on Labs - https://phabricator.wikimedia.org/T162945#3180968 (10chasemp) https://lists.wikimedia.org/pipermail/labs-announce/2017-April/000227.html https://lists.wikimedia.org/pipermail/labs-l/2017-...
[18:42:17] 10Tool-Labs-tools-Xtools, 06Community-Tech: Resolve timeout errors for Adminscore - https://phabricator.wikimedia.org/T164094#3221781 (10Superyetkin) Where is the codebase located? Any chance to examine the SQL query?
[18:45:46] 06Labs: Generate labsdb views for dtywiki, pawikisource, ptwikimedia, wbwikimedia - https://phabricator.wikimedia.org/T164103#3221787 (10jcrespo)
[18:48:23] 06Labs: Generate labsdb views for dtywiki, pawikisource, ptwikimedia, wbwikimedia - https://phabricator.wikimedia.org/T164103#3221787 (10jcrespo)
[18:48:25] 06Labs, 10DBA, 13Patch-For-Review: Prepare and check storage layer for pa.wikisource - https://phabricator.wikimedia.org/T160859#3221813 (10jcrespo)
[18:49:02] 06Labs: Generate labsdb views for dtywiki, pawikisource, ptwikimedia, wbwikimedia - https://phabricator.wikimedia.org/T164103#3221816 (10jcrespo)
[18:49:04] 06Labs, 10DBA: Prepare and check storage layer for dty.wikipedia.org - https://phabricator.wikimedia.org/T162102#3221818 (10jcrespo)
[18:49:33] 10Tool-Labs-tools-Xtools, 06Community-Tech: Resolve timeout errors for Adminscore - https://phabricator.wikimedia.org/T164094#3221821 (10Matthewrbowker) rXTR which is duplicated from https://github.com/x-tools/xtools-rebirth The file you're looking for is https://github.com/x-tools/xtools-rebirth/blob/master/...
[18:49:36] 06Labs, 10DBA: Prepare and check storage layer for wbwikimedia - https://phabricator.wikimedia.org/T162513#3221826 (10jcrespo)
[18:57:46] 10Tool-Labs-tools-Xtools, 03Community-Tech-Sprint: XTools: Top edits - 'All' namespaces option - https://phabricator.wikimedia.org/T160721#3221838 (10Stevietheman) Is this available for user review (beta) anywhere?
[18:59:13] 06Labs, 05Security: Generate labsdb views for dtywiki, pawikisource, ptwikimedia, wbwikimedia - https://phabricator.wikimedia.org/T164103#3221839 (10chasemp) hey #security folks can we can a sign off on creating the normal views for these four wiki's on the labs DB replicas?
[19:04:15] codeofdusk: Sorry your query is getting killed by the timeout limit. I don't think there is any way you are going to be able to run a select like that over all of enwiki in one go. It's just way too much data to be handled by the db server.
[19:06:04] Your best bet would be to break it up into manageable chunks. Something like 10k pages at a time and one namespace at a time should complete in a reasonable amount of time.
[19:06:40] 10Tool-Labs-tools-Xtools, 03Community-Tech-Sprint: XTools: Top edits - 'All' namespaces option - https://phabricator.wikimedia.org/T160721#3221879 (10MusikAnimal) >>! In T160721#3221838, @Stevietheman wrote: > Is this available for user review (beta) anywhere? I don't think the "All" option is working yet, bu...
[19:06:57] 10Tool-Labs-tools-Xtools: Convert all xtools issues to either Phabricator or GitHub - https://phabricator.wikimedia.org/T134632#3221880 (10MusikAnimal) >>! In T134632#3221708, @Matthewrbowker wrote: > @Samwilson @MusikAnimal I'd like to unstall this task if possible. Can we possibly start working through the is...
[19:08:27] bd808: How do you suggest I split it up?
[19:09:44] page_id would probably be the logical field to partition on
[19:10:54] bd808: So what's the syntax look like?
[19:11:24] something like "and page_id => X and page_id <= Y" in the where clause
[19:12:57] the revisions table for enwiki is huge, so anything you can do to narrow down how much of it you are scanning will help
[19:13:34] out of curiosity, what are you going to use the resulting list for?
[19:14:27] you are going to end up with ~7M titles and timestamps if you ever get it to work
[19:16:14] bd808: Writing an http://enwp.org/extended_essay on enenwp page history.
[19:16:52] bd808: How big should each partition be?
[19:17:46] codeofdusk: you may actually want to talk to our analytics group either on their #wikimedia-analytics irc channel or on their mailing list (https://lists.wikimedia.org/mailman/listinfo/analytics)
[19:18:10] they have better ways to collect huge datasets than just raw mysql queries
[19:19:14] BD808: Do you think I could collect 1000000 at a time?
[19:19:44] you can try. I honestly don't know what it will take to fit in the time limit
[19:20:30] if 1M is still too slow I'd binary search it by cutting the interval in half each attempt
[19:26:35] bd808: Yeah good idea.
[19:27:44] BD808: My query now: http://paste.ubuntu.com/24474877/
[19:28:33] yup. git it a try
[19:28:36] *give
[19:29:38] BD808: There are duplicate header rows but nothing my post-processor can't handle.
[19:29:43] BD808: will be*
[19:45:49] 10Tool-Labs-tools-Pageviews: Ability to process pages that are an instance of a Wikidata item - https://phabricator.wikimedia.org/T164113#3222065 (10MusikAnimal)
[20:18:54] 06Labs, 05Security: Generate labsdb views for dtywiki, pawikisource, ptwikimedia, wbwikimedia - https://phabricator.wikimedia.org/T164103#3222180 (10chasemp) Friendly ping for @Bawolff and @dpatrick. I'm not sure if these are approved somewhere else in this capacity or not but I'm trying to error on the side...
[20:35:23] I get a lot of "Unable to initialize environment because of error: denied: host "tools-webgrid-lighttpd-1427.tools.eqiad.wmflabs" is neither submit nor admin host" when using jsub
[20:46:24] 10Tool-Labs-tools-Xtools, 03Community-Tech-Sprint: XTools: Top edits - 'All' namespaces option - https://phabricator.wikimedia.org/T160721#3222252 (10Stevietheman) Thanks. The secret stops here. heh
[20:49:55] 10Tool-Labs-tools-Xtools, 06Community-Tech: [Epic] Rewrite XTools: Top edits - https://phabricator.wikimedia.org/T160137#3222261 (10Stevietheman) I was just taking a look at this again, and wondered why the entries don't have counting numbers (1-100). Is there any particular reason why this can't be there as...
[20:51:27] 10Tool-Labs-tools-Xtools, 06Community-Tech: [Epic] Rewrite XTools: Top edits - https://phabricator.wikimedia.org/T160137#3222266 (10Stevietheman) On second thought, I guess it would be a little tricky as some entries' number of edits would be tied with others.
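Going back to the query-partitioning advice earlier in the log (chunk on page_id, and binary-search the window size by halving it when a chunk times out), the loop can be sketched as below. This is a hypothetical illustration: `run_chunk`, the window sizes, and the helper names are assumptions, not an existing tool; the actual runs went through jsub and the `sql enwiki` client.

```python
def chunk_ranges(max_page_id, step):
    """Yield inclusive (low, high) page_id windows covering 1..max_page_id."""
    low = 1
    while low <= max_page_id:
        yield low, min(low + step - 1, max_page_id)
        low += step

def query_for(low, high):
    """Build the creation-time query restricted to one page_id window."""
    return ("SELECT page_title, rev_timestamp, page_namespace "
            "FROM page JOIN revision ON rev_page = page_id "
            "WHERE rev_parent_id = 0 AND page_namespace IN (0, 1) "
            f"AND page_id BETWEEN {low} AND {high};")

def run_all(max_page_id, step, run_chunk):
    """Run every window; halve the window size whenever a chunk times out."""
    low = 1
    while low <= max_page_id:
        high = min(low + step - 1, max_page_id)
        try:
            run_chunk(query_for(low, high))  # stand-in for the real jsub/mysql call
            low = high + 1
        except TimeoutError:
            if step == 1:
                raise  # even a single page times out; give up
            step = max(step // 2, 1)  # retry the same range in smaller pieces

print(list(chunk_ranges(10, 4)))  # → [(1, 4), (5, 8), (9, 10)]
```

The duplicate header rows mentioned above come from concatenating the per-chunk result sets, which is why a post-processing pass is still needed.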
[20:55:36] 06Labs, 10Labs-Infrastructure: Audit disk usage on labvirts - https://phabricator.wikimedia.org/T163796#3222276 (10Andrew) That spreadsheet is also available as https://docs.google.com/spreadsheets/d/1TRimo0kT_YzlXl_RD3Z7zOZHdj5Piev31ALIKku7Y8g
[20:56:02] Is there a Labs sysop? I got 504 Gateway Time-out when entering to my Webservice
[20:57:15] Even when I stopped the Webservice
[20:57:38] And I'm using kubernetes
[21:01:10] Amitie_10g: link?
[21:01:21] https://tools.wmflabs.org/webarchivebot/
[21:01:36] Amitie_10g: has it worked in the past?
[21:01:56] It worked few minutes ago
[21:02:27] Amitie_10g: when you issue kubctl get pods do you see something with name of tool in the list?
[21:03:04] Let me see
[21:04:11] I'm unable to see my service
[21:04:43] No response
[21:06:04] Amitie_10g: try running webservice --backend=kubernetes start
[21:06:37] Staring
[21:07:01] It returned
[21:07:01] it works for me now Amitie_10g
[21:07:08] Yes
[21:07:35] Amitie_10g: from your error.log -- "2017-04-28 20:47:52: (mod_fastcgi.c.2702) FastCGI-stderr: PHP Parse error: syntax error, unexpected '.' in /data/project/webarchivebot/git/WebArchiveBOT/public_html/template.php on line 31"
[21:07:38] Amitie_10g: my guess is you ran webservice stop and forgot to start it
[21:07:49] Maybe
[21:08:30] oh fancy docs you got there Amitie_10g :)
[21:08:35] I already fixed the code error
[21:09:21] *nod* the error.log and service.log if there is one are the first things to check when things aren't running as expected
[21:11:25] bd808: when they issued the start command for webservice it worked fine i think it was a case of forgetting to start the service backup after stopping it
[21:16:52] Cyberpower678: are you around?
[21:17:05] Zppix: kind of
[21:17:43] Cyberpower678: your bot when it warns about afd tag removal it puts "peachy" in the section header like in https://en.wikipedia.org/w/index.php?title=User_talk%3ASportsAndBusiness&type=revision&diff=777723062&oldid=777716096
[21:18:02] Right. I need to fix that at some point.
[21:18:14] Cyberpower678: okay just giving you a heads up
[21:18:19] ill fix it manually right now
[21:39:48] 10Tool-Labs-tools-Xtools, 03Community-Tech-Sprint: Pages created interface gives error for most projects in new XTools - https://phabricator.wikimedia.org/T163634#3222438 (10MusikAnimal) a:03MusikAnimal
[21:46:42] Another question
[21:49:45] Is possible to deploy HHVM in Proxygen mode as standalone webserver in Kubernetes (or at least the grid) for a tool (serving in the port 80)?
[21:51:49] Or may be possible only in FastCGI mode?
[22:19:04] 06Labs, 06Operations: tools-k8s-master-01 has two floating IPs - https://phabricator.wikimedia.org/T164123#3222598 (10chasemp)
[22:19:16] 06Labs, 06Operations: tools-k8s-master-01 has two floating IPs - https://phabricator.wikimedia.org/T164123#3222613 (10chasemp) p:05Triage>03Normal a:03chasemp
[22:20:25] Amitie_10g: we would probably need to build a new Docker container to setup hhvm either way on the kubernetes cluster
[22:20:57] mmm
[22:21:27] there may be a way to run it on the job grid using the `webservice generic` stuff -- https://wikitech.wikimedia.org/wiki/Help:Tool_Labs/Web#Other_.2F_generic_web_servers
[22:22:01] are you expecting the built-in http server to be better than fcgi somehow?
[22:22:19] for what its worth we use hhvm in fcgi mode in production
[22:23:37] Amitie_10g: After you were asking about hhvm before I started thinking a bit about making an hhvm flavor for the kubernetes clsuter but I haven't really done anything for that yet
[22:23:44] I already seen the Generic webserver and I attempted to setup HHVM, and also I attempted tos etup HHVM as FastCGI withy lighttpd
[22:23:55] Good
[22:24:12] the best bet would be for us to make a proper Docker container for it
[22:24:22] that would make it easier to configure hhvm
[22:24:38] Seems a good idea
[22:25:22] I attempted to create a rule for lighttpd to use HHVM as FastCGI, replacing php5.6, but I got problems with the sockets
[22:25:40] I got working once but I lost that config
[22:25:52] 10Tool-Labs-tools-Xtools, 03Community-Tech-Sprint: Pages created interface gives error for most projects in new XTools - https://phabricator.wikimedia.org/T163634#3222635 (10MusikAnimal) So while the one bug for T163527 needed to be fixed, there was still more ways to improve getting project metadata within XT...
[22:26:25] Amitie_10g: would you like to make a phabricator task for getting a proper hhvm setup for kubernetes? I think I can carve out some time to work on that next week
[22:26:51] Yes, I'll do that
[22:29:03] If you have ideas about the config settings that would be useful please put them in the task. We'd probably want to try running in the proxygen mode so we don't need lighttpd in the container too
[22:29:22] and maybe turn on the PHP7 compat flag?
[22:31:37] the official docker image has a pretty simple config -- https://github.com/hhvm/hhvm-docker/blob/master/hhvm-latest-proxygen/server.ini
[22:35:38] The problem is proxygen uses only ports and don't supports socket (that a shared infraestructure wants)
[22:38:48] 10Tool-Labs-tools-Other, 13Patch-For-Review, 15User-bd808: Add domain info to openstack-browser - https://phabricator.wikimedia.org/T164085#3222683 (10bd808) 05Open>03Resolved DNS information is now listed on the project page if there are records for that particular project. See https://tools.wmflabs.org...
[22:39:38] The setting I tried is invoke HHVM in FastCGI mode with one single socket (hhvm.sock.webarchivebot). When I displayed a web page, the logs displayed that lighttpd attempted to open a socket that does not exist (hhvm.sock.webarchivebot-1)
[22:41:04] And I don't know how can HHVM create more than one socket (rather tan launching two or more instances of HHVM with their correspondient sockets), unlike PHP that can be invoked launching more than one socket (as I know)
[22:41:16] Amitie_10g: in the kubernetes setup having a port would be the right thing. The nginx proxy that acts as the router for tools.wmflabs.org used a lookup table to find the host:port to reverse proxy.
[22:41:42] I'll poke at it and see if I can make something that works
[23:02:30] 10Tool-Labs-tools-Xtools, 06Community-Tech: Add missing meta.sql file to xtools-rebirth repo - https://phabricator.wikimedia.org/T164127#3222737 (10kaldari)
[23:25:21] I got working HHVM in FastCGI by invoking the following in the following
[23:25:37] "/usr/bin/hhvm --mode daemon -d hhvm.server.type=fastcgi -d hhvm.server.file_socket=/var/run/lighttpd/hhvm.socket.webarchivebot -d hhvm.server.source_root=/data/project/webarchivebot/public_html -d error_log=/data/project/webarchivebot/hhcv_error.log -d hhvm.log.file=/data/project/webarchivebot/hhvm_log.log"
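For the proxygen-on-Kubernetes idea discussed above (port-based, no lighttpd in the container), a minimal server.ini sketch modeled on the linked hhvm-docker config might look like the following. The port number and paths here are illustrative assumptions, not the eventual Tool Labs setup:

```ini
; Hypothetical sketch, modeled on hhvm-docker's hhvm-latest-proxygen/server.ini.
; Port and source root are placeholders; the nginx proxy would route tool traffic
; to this host:port via its lookup table.
hhvm.server.type = proxygen
hhvm.server.port = 8888
hhvm.server.source_root = /data/project/webarchivebot/public_html
hhvm.server.default_document = index.php
hhvm.log.use_log_file = true
hhvm.log.file = /data/project/webarchivebot/hhvm_log.log
; the "PHP7 compat flag" mentioned above
hhvm.php7.all = true
```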