[01:10:51] Labs, MediaWiki-Vagrant: Wikimedia SMTP server does not work with Labs-Vagrant - https://phabricator.wikimedia.org/T117391#1791629 (yuvipanda)
[01:50:01] Quarry: Query counter increases but draft query is not accessible when window is closed and query doesn't have a title - https://phabricator.wikimedia.org/T101394#1791638 (XXN) A similar problem: there are some queries without a title where the placeholder 'None' does not appear, and I can't access the query even...
[09:50:58] Labs, Tool-Labs, Database: Missing rows in revision table of enwiki.labsdb (data integrity issue) - https://phabricator.wikimedia.org/T118095#1791802 (jcrespo)
[09:51:00] Labs, Database: Lots of rows are missing from enwiki_p.`revision` - https://phabricator.wikimedia.org/T115207#1791803 (jcrespo)
[10:23:34] Labs, Tool-Labs, Database: Missing rows in revision table of enwiki.labsdb (data integrity issue) - https://phabricator.wikimedia.org/T118095#1791817 (jcrespo) Despite being a mixture of old and recent pages, all missing edits seem to be from around the same dates. This would point to a range of transac...
[10:23:52] Labs, Database: Lots of rows are missing from enwiki_p.`revision` - https://phabricator.wikimedia.org/T115207#1791819 (jcrespo) Despite being a mixture of old and recent pages, all missing edits seem to be from around the same dates. This would point to a range of transactions missing. I will identify the e...
[10:46:58] Change on wikitech.wikimedia.org a page Nova Resource:Tools/Access Request/StudiesWorld was created, changed by StudiesWorld link https://wikitech.wikimedia.org/wiki/Nova_Resource:Tools/Access_Request/StudiesWorld edit summary: Created page with "{{Tools Access Request |Justification=I plan to use the Tools project to experiment with analyses of the Wikipedia database. |Completed=false |User Name=StudiesWorld }}"
[14:10:05] I'm running a bot (Kenrick95Bot) on Tool Labs and it gets logged out every 30 days; is there any configuration to make the bot log back in automatically? Thanks.
[14:11:18] kenrick95: pywikibot?
[14:11:31] yup
[14:11:43] it's a welcome bot
[14:11:46] kenrick95: see step 5 in https://wikitech.wikimedia.org/wiki/User:Russell_Blau/Using_pywikibot_on_Labs
[14:12:18] nice, thanks a lot :)
[18:01:14] does any tool have url.rewrite-once set, so that I can take a look and see what I am doing wrong?
[18:01:18] as for me
[18:01:24] url.rewrite-once += ("^/schlagbäume/([_a-zA-Z]*)" => "/schlagbäume.php?wiki=$1")
[18:01:27] it seems not to work
[18:02:20] (e.g. I get a 404 on http://tools.wmflabs.org/basebot/schlagbäume/be_x_oldwiki )
[18:05:32] labs is not reachable within labs iirc
[18:05:48] you have to use the internal name
[18:06:40] so https://wikitech.wikimedia.org/wiki/Help:Tool_Labs/Web#Example_configurations is another wrong piece of information on the page
[18:06:42] or does it rewrite it wrong?
[18:06:53] what is the internal name?
[18:06:58] i don't know
[18:07:08] but i guess what i just said is not relevant to your case
[18:07:45] It does not seem to rewrite at all. But I am not sure, as I have never actually dealt with all these configs and stuff
[18:08:18] just thought that it's good to have some aliases and follow the help page
[18:56:19] Base: the url doesn't start with /schlagbäume but with /basebot/schlagbäume
[18:58:21] and you might need to url-escape it
[18:59:43] adding /basebot didn't help
[19:00:02] ä you mean?
[19:00:41] %C3%A4
[19:01:01] debug.log-request-handling = "enable" might also come in handy
[19:01:21] what will it do?
[19:01:35] log it into some file I guess. which one?
[19:01:38] add debug logging to request handling
[19:01:39] error.log
[19:04:51] works with escaping, thanks. though I have to guess how to make css work this way. damn, but aren't we in the Unicode era? crap..
[19:06:39] I'm not sure how this is related to css
[19:06:59] it's just that I have css linked by a relative path in the code
[19:07:04] as for 'the unicode era'; this matches whatever the client sends without further interpretation
[19:07:18] and non-ascii characters are sent as urlencoded utf-8 octets
[19:08:03] :/
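Putting the two fixes from this exchange together (the /basebot prefix valhallasw points out, plus percent-encoding the non-ASCII characters, since lighttpd matches the raw urlencoded request), the working rule was presumably something along these lines. This is a reconstruction for illustration, not the exact config Base ended up with:

  # match the percent-encoded form of "schlagbäume" that the client actually sends,
  # including the /basebot tool prefix (reconstruction; untested)
  url.rewrite-once += ( "^/basebot/schlagb%C3%A4ume/([_a-zA-Z]*)" => "/schlagbäume.php?wiki=$1" )
  # while debugging, this logs request handling to error.log, as mentioned above
  debug.log-request-handling = "enable"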
[19:46:37] PROBLEM - Host tools-andrew-puppettest is DOWN: CRITICAL - Host Unreachable (10.68.21.109)
[21:06:24] Why is SELECT rev_timestamp FROM revision WHERE rev_user_text="Dispenser" LIMIT 1; taking forever?
[21:07:56] Dispenser: where are you selecting from?
[21:08:05] wait
[21:08:18] you should be using revision_user_index
[21:09:28] revision_userindex*
[21:15:33] https://phabricator.wikimedia.org/T68786 Considering I can't run a query for more than 8 hours without somehow getting disconnected, we may as well do this
[21:22:45] performance still sucks
[21:25:06] Change on wikitech.wikimedia.org a page Nova Resource:Tools/Access Request/StudiesWorld was modified, changed by Tim Landscheidt link https://wikitech.wikimedia.org/w/index.php?diff=201070 edit summary:
[21:27:02] Dispenser: it all depends on how you're running the query; most of mine have very reasonable run times
[21:28:07] Is there a list of which tables have which indexes?
[21:30:35] Dispenser: they use the WMF table structure except for the _userindex tables which have added rows
[21:30:53] * indexes
[21:31:17] Labs, wikitech.wikimedia.org: "Edit with form" missing on a Tools access request page - https://phabricator.wikimedia.org/T118136#1792342 (scfc) NEW
[21:32:01] Dispenser: https://wikitech.wikimedia.org/wiki/Help:Tool_Labs/Database#Tables_for_revision_or_logging_queries_involving_user_names_and_IDs
[21:32:45] (but I agree 'it's the same as on mediawiki.org except when it's not' is not a very useful statement when it comes to database indices...)
[21:35:03] Like I'm not sure if I should be using revision_userindex when it's matched on rev_id, but the optimizer might want to use rev_user_text instead
[21:36:41] as in: you want a list of revisions by user X newer than Y?
[21:37:51] [Edit to find] and [Edit by this user]
[21:38:03] ?
[21:38:29] Similar, but not a range query
[21:38:43] if you know the rev_id, why would you need to select on rev_user_text?
[21:38:56] because then there is only a single result
[21:42:31] Because apparently I'm joining it into the page table and something about the space and underscore swap has me worrying about performance
[21:43:28] uuuh.
[21:44:09] Dispenser: what are you doing?
[21:44:15] nvm, it's too narrow of a case
[21:45:30] Dispenser: if you can use a query where you have the rev_id, either table should be just as fast, but the _userindex table may not contain the info if it's been rev_del'ed
[21:46:00] also keeping the joins and overall query simple helps
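For reference, Dispenser's query from the start of this exchange rewritten against the view suggested above — a sketch of the intended usage per the linked Help page:

  -- revision_userindex exposes the same columns as revision,
  -- but with usable indexes on the user columns (per the Help page above)
  SELECT rev_timestamp
  FROM revision_userindex
  WHERE rev_user_text = 'Dispenser'
  LIMIT 1;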
[22:13:22] Hi
[22:13:25] I have a request
[22:13:37] How do I run BIG/long queries on Quarry?
[22:13:45] Currently it kills stuff that takes over 20 mins...
[22:13:53] which is typically the big queries I do daily
[22:14:12] Is there a way to mark them as slow so the server doesn't overload (killing the query)?
[22:15:05] ShakespeareFan00: hi
[22:15:14] unfortunately no, quarry does kill things at the 20min mark
[22:15:20] how long do you think the queries you run are going to be?
[22:15:51] It depends, but based on past experience they seem to be around the 30 to 60 min mark...
[22:16:07] They are queries looking for lists of images
[22:16:19] so I guess you need to get a toollabs account, create a tool and run them from there
[22:16:20] which takes time because it has to cross-reference a LOT of data
[22:16:32] I can probably increase the Quarry time limit to 30min, but probably not to 1h
[22:16:38] Or bother someone to write it
[22:16:54] What I am trying to do with most of them is finding "missing" stuff
[22:16:58] why is its icon a bulldozer?
[22:17:43] gifti: hay designed it at the London wikimania :)
[22:17:49] and I guess bulldozers are used in Quarries..
[22:17:56] https://quarry.wmflabs.org/query/5647 for example is looking for missing information blocks, which has been rather useful in reducing the number of files with "No machine readable source", "no machine readable description" etc...
[22:18:06] oh, a pun
[22:18:18] my goodness
[22:18:21] :)
[22:18:30] It also allows you to bulldoze through a LOT of data quickly?
[22:18:42] ShakespeareFan00: I'm going to increase the time limit to 30min now
[22:18:51] i prefer the direct sql access …
[22:19:13] Certainly makes it a lot easier when doing the sort of things the WP:DBR page used to do :)
[22:19:19] Like finding Orphaned Non-free files
[22:19:22] I need to make Quarry do templates properly
[22:19:27] then you can output them in clickable form
[22:19:42] and also make it do crons properly, so people can run them daily/weekly
[22:19:46] Can you also figure out ways to speed up some queries?
[22:19:55] unfortunately no, that requires SQL knowledge
[22:19:59] which I don't have :(
[22:20:12] or more hardware, which I don't have budget for
[22:20:20] I meant things like using cached data sources and so on
[22:20:52] If you can find someone to ask about it, some of the things I am doing could possibly also be done with stored procedures...
[22:20:54] IIRC
[22:21:10] (And yes, you would need an SQL person to explain what those are)
[22:21:22] yeah
[22:21:58] ShakespeareFan00: I got http://mediawiki.org/wiki/Talk:Quarry made into a flow board
[22:22:03] so people can post quarry questions there
[22:22:16] It's a shame that the lack-of-metadata problem isn't something that can be engineered out of the Database/Mediawiki
[22:22:53] ShakespeareFan00: me and halfak have been talking a bit about that
[22:23:35] Possibly by enforcing (at least at English Wikipedia) an 'information' block at upload
[22:23:47] (people will of course still add junk data... but...
[22:23:58] so should I increase the Quarry time limit to 30min and see what happens?
[22:24:11] Ask other users as well
[22:24:21] I might be the only one running huge queries
[22:24:33] And as I said, I'd like to see a "SLOW" option
[22:24:58] which runs the lengthier queries at a low priority
[22:25:06] even if they will take longer
[22:25:17] than the current 30 mins
[22:25:40] yeah, I don't think labsdb supports that unfortunately
[22:25:43] so there isn't much I can do at quarry
[22:25:55] I can do something like kill SLOW queries based on load on labsdb
[22:26:03] but I'd rather just get more hardware
[22:26:40] Why doesn't labsdb have SLOW?
[22:27:03] I know my queries are big... so a SLOW / lower-priority option seems reasonable
[22:27:13] you need something that'll selectively kill queries
[22:27:21] labsdb just doesn't have a query killer
[22:27:37] so if you do it directly, it won't kill your queries explicitly
[22:27:53] but Quarry is a much easier-to-use system than direct SQL and so has limits
[22:27:59] labsdb itself has no limits
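As an aside, a "missing information block" query of the kind ShakespeareFan00 describes (query 5647 above) might look roughly like the following. This is a hypothetical sketch, not the actual query: it assumes the replica's templatelinks table and that the block is transcluded directly as Template:Information.

  -- File pages (namespace 6) with no transclusion of Template:Information (namespace 10)
  -- hypothetical reconstruction; the real query 5647 may differ
  SELECT page_title
  FROM page
  LEFT JOIN templatelinks
         ON tl_from = page_id
        AND tl_namespace = 10
        AND tl_title = 'Information'
  WHERE page_namespace = 6
    AND tl_from IS NULL
  LIMIT 100;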
[22:36:00] ShakespeareFan00: if you need a DBR let me know
[22:36:20] Check out Sfan00's queries on Quarry...
[22:36:27] Lots of interesting queries there
[22:36:29] :)
[22:36:35] Mostly to do with missing metadata
[22:36:58] ShakespeareFan00: if you have specific requests I can whip those up, but I'm not going to look for random tasks
[22:37:05] Noted
[22:37:21] Trying to figure out a way to look for images with no license tag at present
[22:38:36] ShakespeareFan00: that's easy
[22:38:51] You would have thought so...
[22:38:57] ShakespeareFan00: all images should be in one of two categories
[22:39:02] One of three
[22:39:10] ShakespeareFan00: two
[22:39:29] And then you have to filter out all the file description pages that are DYK / FP tags for files at Commons
[22:39:45] Betacommand: It's three... Free, Non-Free and Free in US
[22:39:54] Free means it can move to Commons...
[22:40:25] ShakespeareFan00: just cross-reference the image table with the page table.
[22:40:28] Free in US means it's free in the US and thus locally doesn't need an NFUR, but can't be moved to Commons
[22:40:36] Non-free is what needs an NFUR
[22:40:42] So it's three categories
[22:41:22] ShakespeareFan00: that should be a subcategory of "All Free media"
[22:41:33] I would disagree
[22:41:44] Because it's NOT free outside the US
[22:41:57] Meaning it can't be moved to Commons
[22:42:35] ShakespeareFan00: "All Free media" is the inverse of what [[WP:NFC]] is
[22:42:58] ShakespeareFan00: If it requires an NFUR then it's non-free.
[22:43:07] Not disputing that
[22:43:17] Otherwise it's free; there are degrees of freeness.
[22:43:26] but it's still classified as free media
[22:43:41] I still feel there should be a distinction between Free and Free in some places
[22:43:41] Betacommand: I will have to disagree here
[22:44:00] because of the project goal of 'free' stuff
[22:44:04] (ie Libre)
[22:44:15] ShakespeareFan00: with regards to enwp policy there are only two types of media
[22:44:37] enwp policy has shifted, in my view
[22:44:58] I'll be back in a bit, if you want something specific let me know
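Betacommand's two-category approach above could be sketched as a query like this. The tracking category names here are assumptions for illustration, and the "Free in US" and DYK/FP filtering ShakespeareFan00 raises is left out:

  -- Local file pages in neither of the (assumed) license tracking categories
  SELECT page_title
  FROM page
  LEFT JOIN categorylinks
         ON cl_from = page_id
        AND cl_to IN ('All_free_media', 'All_non-free_media')
  WHERE page_namespace = 6
    AND cl_from IS NULL
  LIMIT 100;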
[22:46:15] Any admins about?
[22:46:28] I'd like to amend a template I wrote a while back
[22:46:34] but it's currently protected
[22:49:34] I've put the new version in the sandbox
[22:49:45] Any admins awake?
[22:49:49] admin for what project, ShakespeareFan00? labs? :)
[22:49:55] enwikipedia
[22:50:11] https://en.wikipedia.org/wiki/Template:Non-free_biog-pic
[22:50:20] And to be fair the categorisation should also be updated
[22:50:26] to something more appropriate
[22:50:47] wrong channel, probably :)
[22:54:43] Oh /...
[22:54:45] Bright red
[22:54:49] Apologies all
[23:12:03] valhallasw`cloud: I added stats for our proxies
[23:12:15] valhallasw`cloud: tools is doing about 60 reqs/s and the general proxy is doing about 90 reqs/s
[23:12:50] Cool
[23:13:05] * valhallasw`cloud is actually asleep
[23:13:14] valhallasw`cloud: nice sleep talking
[23:13:17] valhallasw`cloud: gooood night
[23:13:36] ;) g'night
[23:32:46] hi all, is it normal that if I list the instances for a project (wikitolearn-dev) I do not see any instances?
[23:32:57] CristianCantoro: hi! log out and back in?
[23:33:11] ok YuviPanda!
[23:33:28] enwp policy is just idiocy. Like, they're duplicating lots of files which are available on Commons just on authors' whims. I hope some day we'll have a global policy forbidding WPs from storing free images once they are moved to Commons, and we can delete all that stuff from there
[23:34:19] YuviPanda: solved, thanks! (This is *black magic*)
[23:35:14] YuviPanda: you need to make a bot which looks for keywords, and if it matches the common labs issues, it replies 'have you tried logging out and back in?' :)
[23:35:20] JohnFLewis: :D
[23:35:30] JohnFLewis: LoL!