Module talk:String
See also
- Module:String (talk | history | links | watch | logs)
sub
Why does sub only return a single character? It returns characters in the string from position "i" to position "i" (only a single character). Shouldn't it go from "i" to "j" like the Lua support page suggests? Banaticus (talk) 00:04, 21 February 2013 (UTC)
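For reference, standard Lua's string.sub takes both a start and an end index. The following is a minimal plain-Lua sketch of that behaviour; it illustrates the Lua library function only, not how this module's sub was written at the time of the question, and the sample string is made up.

```lua
-- Plain Lua, not the module's own code: string.sub(s, i, j) returns
-- the characters from position i through position j, inclusive.
local s = "Hello, world"

local one_char = string.sub(s, 1, 1)   -- i == j gives a single character
local range    = string.sub(s, 1, 5)   -- i < j gives a range of characters
local tail     = string.sub(s, -5)     -- negative i counts from the end; j defaults to -1

print(one_char, range, tail)
```

Passing the same value for both indices is therefore the only case in which a single character is expected.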
error_category
The documentation/comment in the top says: error_category: The default category is ... [Category:Errors reported by Module String]. The category has regular double brackets I assume, or is there an exception in play? -DePiep (talk) 02:44, 27 February 2013 (UTC)
- Yeah, the issue is that double bracket gets interpreted as open / close comment in Lua, so you can't really write that in the middle of the documentation without messing things up. Dragons flight (talk) 03:44, 27 February 2013 (UTC)
Error category: two arguments for one?
To set (overrule) the error category, one can use two arguments: error_category=... and no_category=true. Why is that not a single argument? Entering error_category=<blank> could simply withhold the category. As it is now, there is even the futile situation error_category=[[MyCategory]] and no_category=true. -DePiep (talk) 12:02, 6 March 2013 (UTC)
- The presence of two parameters came about in an effort to support the existing templates. As I recall, some want one kind of control, and some want the other. It could probably be standardized, but in the initial migration I was trying to avoid making too many changes to the behavior of existing templates. There is also a bit of a notation problem if one has a default category, since then it becomes unclear whether error_category= (empty string) is meant as "no category" or as "use the default category". Dragons flight (talk) 17:24, 9 March 2013 (UTC)
Match
Is there a way to use match to eliminate hyphens from ISBN numbers? For instance: 978-1-4200-9050-X to 978142009050X. I tried,
{{#invoke:String|match|s=978-1-4200-9050-X|pattern=^(%d*)-*(%d*)-*(%d*)-*(%d*)-*(%d*X*)}} > 978
but I couldn't make it work. Can anybody help me? —– Jaider Msg 20:06, 12 March 2013 (UTC)
- If you just want to eliminate hyphens, shouldn't you replace them with empty strings, i.e.
- {{#invoke:String|replace| source=978-1-4200-9050-X | pattern=- | replace= }} = 978142009050X
- You can use match to ensure that the input or output has the appropriate ISBN form, if that is also important. Dragons flight (talk) 00:50, 13 March 2013 (UTC)
- Thanks! But my question is not just about ISBNs. How can we access several values returned from {{#invoke:String|match|...}}? (in other words, several (...) in patterns). And how can we use match to ensure that the input and output has the appropriate ISBN form? —– Jaider Msg 01:15, 13 March 2013 (UTC)
- At present, you can't access multiple (...), not from a template anyway. This is something I should think about how to address. As to using match for checking, something like:
- {{#invoke:String|match|s=978-1-4200-9050-X|pattern=^%d[%d-]*X?$ | nomatch = Not ISBN }} = 978-1-4200-9050-X
- {{#invoke:String|match|s=978-1-BARK-9050-X|pattern=^%d[%d-]*X?$ | nomatch = Not ISBN }} = Not ISBN
- Will work if you aren't picky about the number of digits or the placement of dashes. If you want to be careful about the details you can build a more sophisticated test by using several test calls or writing a short script in Lua. Dragons flight (talk) 02:08, 13 March 2013 (UTC)
- Great! Well, I am not a programmer and I am not sure about Lua stuff, but I made the following script:
local p = {}
function p.isbn(frame)
    local isbnString = frame.args[1] or ""
    local value1, value2, value3, value4, value5 = string.match(isbnString, "^(%d*)-*(%d*)-*(%d*)-*(%d*)-*(%d*X*)")
    return value1, value2, value3, value4, value5
end
return p
And it works ({{#invoke:SomePage|isbn|978-1-4200-9050-X}} = 978142009050X). Could it be a kind of a solution for several "(...)" in patterns? —– Jaider Msg 12:46, 13 March 2013 (UTC)
- Yes, Lua can match and return multiple patterns. The tricky part is writing a template interface that could access that in a sensible way, especially if you don't know in advance how many capture patterns (...) the template author might want to use. The string module exists mostly to support legacy template code and to provide some string functionality to editors who understand templates but aren't willing to try Lua directly. For a simple dedicated task, like finding an ISBN, writing a short Lua script is probably easier. Congratulations on your first one. Dragons flight (talk) 13:50, 13 March 2013 (UTC)
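The approaches discussed in this thread (removing hyphens, checking the overall shape with match, and reading several captures) can be sketched in plain Lua as follows. This is a minimal illustration and not part of Module:String: the helper names strip_hyphens, looks_like_isbn and isbn_parts are made up for the example, and the check is as loose as the one in the thread (it does not verify digit counts or check digits).

```lua
-- Hypothetical helper names; only string.gsub and string.match are standard Lua.

-- Remove every hyphen. "-" is a magic character in Lua patterns, so it is escaped as "%-".
local function strip_hyphens(isbn)
    -- The outer parentheses discard gsub's second return value (the replacement count).
    return (string.gsub(isbn, "%-", ""))
end

-- Loose check from the discussion: a digit, then digits or hyphens, then an optional X.
local function looks_like_isbn(isbn)
    return string.match(isbn, "^%d[%d-]*X?$") ~= nil
end

-- string.match returns one value per capture; collect them into a table.
local function isbn_parts(isbn)
    return { string.match(isbn, "^(%d*)-*(%d*)-*(%d*)-*(%d*)-*(%d*X*)") }
end

print(strip_hyphens("978-1-4200-9050-X"))
print(looks_like_isbn("978-1-4200-9050-X"), looks_like_isbn("978-1-BARK-9050-X"))
print(table.concat(isbn_parts("978-1-4200-9050-X"), " "))
```

Collecting the captures into a table is one way to handle a number of (...) groups that is not known in advance, which is the difficulty mentioned above for a template interface.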
Pages as strings?
Is it possible to modify this script to allow whole pages as input? For example, if one wanted to include information about article size to Wikipedia:Vital articles? Or would 1000 instances of the script be too much to run on every single page load? — Yerpo Eh? 12:07, 28 April 2013 (UTC)
- Yes, there are ways to operate on an entire page's content, though if all you wanted was page size then the parser function {{PAGESIZE:page name}} probably makes more sense. However, loading entire pages is expensive, which means somewhat slow and limited to no more than 500 times per page. That limit applies to the PAGESIZE: parser function as well, so neither Lua nor PAGESIZE: would work if you needed 1000 iterations on a single page. Dragons flight (talk) 17:15, 28 April 2013 (UTC)
- There are ways to minimize the need to call expensive parser functions repeatedly in some cases. In Lua the result can be stored in a variable for reuse. Something similar can be done with templates by passing the result of an expensive parser function as the value of a template argument/parameter. Either way the result could be printed a trillion times with one invocation or template call, as long as the time allocated for Lua and template expansions isn't exceeded. If you look at the result of {{#invoke:string|rep|{{PAGESIZE}}•|1000}} for example, the page size is displayed 1000 times with only one invoke, because the expensive parser function was only called once. This wouldn't work for the specific use case you have in mind though, as there are more than 500 different pages to get the page size for. --darklama 19:52, 29 April 2013 (UTC)
Not really a script-related subquestion, but come to think about it, parsing pages isn't necessarily unavoidable if all I want is page size. Is there an on-wiki handle available to extract it from page history? — Yerpo Eh? 05:51, 30 April 2013 (UTC)
- Besides {{PAGESIZE:page name}}, the MediaWiki API can be queried through JavaScript to find out page sizes. JavaScript is probably going to be the only way you will be able to include the current size for every page. --darklama 10:53, 30 April 2013 (UTC)
Replication on other wikis
Hi, I have just discovered this new "Lua programming" functionality in Wikimedia (I am an Italian user). I have a question (I don't know if other users have already talked about this). I have noticed that all wikis are replicating this base library (String), changing only the error messages (localization). Is there no feature for using a single shared String library among wikis (like images on Commons), instead of replicating it on each wiki? "String" is a very basic library, and if someone discovers a bug here, the fix should be propagated to each wiki (or vice versa).
If a shared library is not possible, at least it would be better to set the localization error strings as variables at the beginning of the source code, so that in other wikis we can cut&paste all the remaining part of the code without changing a line.
If somewhere you have already talked about these problems I would be happy to read about it. Thanks! --Rotpunkt (talk) 11:52, 1 May 2013 (UTC)
- No sharing mechanism currently exists, other than cut and paste. There has been general discussion at the WMF about creating a central code repository for key scripts, but that is likely to be at least months away. Yes, we should do a better job of making localization easier. Dragons flight (talk) 13:39, 1 May 2013 (UTC)
- Ok, thanks. I will look for that discussion. As a repository, it would be nice, for example, if the modules on Commons (http://commons.wikimedia.org/wiki/Commons:Lua/Modules) could be called from all wikis, so that we could put the most used libraries (like String) there, in the same way we use Commons for images. I don't know where we could ask for such a feature... here: http://www.mediawiki.org/wiki/Extension_talk:Scribunto ? --Rotpunkt (talk) 14:39, 1 May 2013 (UTC)
- Translations might be possible with something like msg = mw.message.new('Empty string'):plain();. If I've understood the documentation correctly the message is retrieved from MediaWiki:Empty_string. $1, $2, etc. can be filled in by including additional parameters to mw.message.new. It might also be possible to use MediaWiki:Empty_string/it to include both English and Italian translations for example. If I've understood the documentation correctly this would cut down on needing to edit the module at all. --darklama 16:27, 1 May 2013 (UTC)
- To fully localize the script, it would also be necessary to localize the default error category.--Moroboshi (talk) 06:56, 3 May 2013 (UTC)
- Well, actually a full localization would also localize the arguments.--Snaevar (talk) 23:57, 3 May 2013 (UTC)
Help needed
This help request has been answered. If you need more help, you can contact the responding user(s) directly on their user talk page, or consider visiting the Teahouse.
Please do not deactivate this {{help me}} until 09 Jun 2013 unless you are answering my question. I know that anyone who can help probably has this page watchlisted, but just in case... Now, my questions:
- Is there any way I can shorten the following replace sequence?
{{#invoke:String|replace|{{#invoke:String|replace|{{Str sub old|{{{TEST-STRING}}}|0|25}}|[^%[%]\{}%`%^%-%w]|_|plain=false}}|^[^%[%]\{}%`%^%a]|_|plain=false}}
- The process currently truncates {{{TEST-STRING}}} to 25 characters, replaces all characters outside of the "allowed" set [^%[%]\{}%`%^%-%w] with _, then finally replaces the first character of the string with _ if it is outside the "allowed" first character set [^%[%]\{}%`%^%a].
- The next question is, how do I test the result of the above process to see if all of the characters have been replaced with _?
- I was thinking something like
{{#ifeq:{{#invoke:String|len|{{{TEST-STRING}}}|MATCH|100% invalid input...|{{#invoke:String|replace|{{#invoke:String|replace|{{Str sub old|{{{TEST-STRING}}}|0|25}}|[^%[%]\{}%`%^%-%w]|_|plain=false}}|^[^%[%]\{}%`%^%a]|_|plain=false}}}}
but I don't know how to count the instances of "_" in the string to fill in the "MATCH" section...
- Thanks for any help you can offer. :) Technical 13 (talk) 18:13, 7 June 2013 (UTC)
- i think maybe it would be better if you try to explain what are you actually trying to do, rather than asking us to suggest methods to optimize some obscure piece of code, no? peace - קיפודנחש (aka kipod) (talk) 18:29, 7 June 2013 (UTC)
- It is for work on the Template:Freenode/sandbox that adds an argument to allow the person leaving the template to specify an IRC handle based on the user's wikipedia username. Technical 13 (talk) 18:34, 7 June 2013 (UTC)
- More accurately, to make sure the inputted string (username) is appropriately modified so it follows the IRC rules for names
- Maximum 25 characters [So truncating the string]
- First character cannot be number [So replacing first character by _]
- No character can be outside a list of characters (a-z,A-Z,0-9,_) [So replacing all of them by _]
- Correct me if there are more rules/ the rules listed are incomplete.
- TheOriginalSoni (talk) 19:02, 7 June 2013 (UTC)
- i still do not understand what you try to do. let me try to focus the question: are you looking for a template/function that will receive a string and will return a boolean (or 0/1 or whatever) that indicates whether this string is "kosher" (according to some criteria), or are you trying to create something that receives a string and cook a "legal" string out of it? or maybe something else entirely? if it's something else, can you explain it again? maybe i'll have better luck understanding it this time around... peace - קיפודנחש (aka kipod) (talk) 20:13, 7 June 2013 (UTC)
- You should write the logic in Lua instead of parser functions that call Lua, and then replace that mess in your template with {{#invoke:YourModule|functionName|{{{VARIABLE}}}}}. Seriously, there's no reason at all to do what you did there. And I also note that those replace calls won't even do what you want, since Freenode doesn't appear to allow UTF-8 in nicks. Anomie⚔ 20:41, 7 June 2013 (UTC)
- Anomie would you be willing to help me with that? I don't know how to write the logic in Lua yet. I came here to ask because I knew that there had to be an easier shorter way to do it, but I did not know how. To answer your question kipod, create something that receives a string and cook a "legal" string out of it is the goal. Technical 13 (talk) 21:07, 7 June 2013 (UTC)
- Something vaguely like this should get you started.
local p = {}
function p.guessNick( frame )
    local username = frame.args[1]
    local nick
    -- First, strip out non-ASCII as best we can
    -- Note this will totally fail for non-Latin-script usernames. Nothing much we can do about that.
    nick = mw.ustring.toNFD( username )
    nick = string.gsub( nick, '[^\32-\126]', '' )
    -- Next, replace other unacceptable characters
    if string.match( nick, '^[0-9%-]' ) then
        -- Begins with a number or a hyphen, so prepend an underscore
        nick = '_' .. nick
    end
    -- "\\" puts a literal backslash in the allowed set; a lone "\%" would not
    nick = string.gsub( nick, '[^a-zA-Z0-9_%-%[\\%]{|}^`]+', '_' )
    -- Cut to 25 characters
    nick = string.sub( nick, 1, 25 )
    return nick
end
return p
Match problem
Is there a problem with match or is it something that I don't understand? Match is supposed to return the string that matches a pattern.
If I want all of the digits up to and including the '4' in the string '1234567890' I do this:
{{#invoke:String|match|1234567890|%w*4|nomatch=no match}}
→ 1234
If I want the length of a string I do this:
{{#invoke:String|len|1234}}
→ 4
If I want to find the length of the matched string I do this:
{{#invoke:String|len|{{#invoke:String|match|1234567890|%w*4|nomatch=nomatch}} }}
→ 5
Isn't '5' the wrong result?
—Trappist the monk (talk) 14:32, 10 October 2013 (UTC)
- Try it without the space on the end. -- WOSlinker (talk) 18:19, 10 October 2013 (UTC)
{{#invoke:String|len|1234}}
→ 4
{{#invoke:String|len|1234 }}
→ 5
{{#invoke:String|len|{{#invoke:String|match|1234567890|%w*4|nomatch=nomatch}}}}
→ 4
- "O that he were here to write me down an idiot! But, masters, remember that I am an idiot; though it be not written down, yet forget not that I am an idiot." (apologies to Shakespeare's Dogberry).
- —Trappist the monk (talk) 18:52, 10 October 2013 (UTC)
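The extra character in the example above is the space before the closing braces, which is passed along as part of the unnamed parameter. A minimal plain-Lua sketch of the same effect follows; trim is a hypothetical helper written for this example and is not a function of this module.

```lua
local with_space    = "1234 "   -- what len receives when a space precedes the closing braces
local without_space = "1234"

-- Trim leading and trailing whitespace (a common Lua idiom; the name is made up here).
local function trim(s)
    -- The outer parentheses keep only the captured text.
    return (string.match(s, "^%s*(.-)%s*$"))
end

print(#with_space, #without_space, #trim(with_space))
```

Trimming the input before measuring it gives the same length whether or not the stray space is present.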
Replace
Hello all, I ran into a problem and would appreciate any help. Trying to use this function to replace strings with [[ does not work, I guess because it tries to parse it as a link. Example:
- Trying to transform two words separated by - into two wikilinks:
[[{{subst:#invoke:String|replace|Foo-bar|-|]] and [[}}]]
- As you can see, it does not work. Replacing with the HTML character code for [ is not an option because it does not create wikilinks.
I understand one could "post-process" that output, like replacing the character code with [ using {{Str rep}}, but that's too ugly :) Is there a way to circumvent that? Am I missing something? Cainamarques (talk) 14:00, 3 December 2013 (UTC)
- Your problem is that MediaWiki considers the brackets as well as the braces when trying to parse the wikitext, so it sees it as two potential links ("[[{{subst:#invoke:String|replace|Foo-bar|-|]]" and "[[}}]]") rather than as one parser function call inside brackets that has more brackets in its arguments. There doesn't seem to be any easy way around this. Anomie⚔ 14:57, 3 December 2013 (UTC)
- Just curious here, what exactly are you trying to accomplish? Maybe there is a way to achieve your goal without replacing - with ]] and [[? Technical 13 (talk) 15:02, 3 December 2013 (UTC)
I had lunch, thought a little bit and found a solution:
[[{{subst:#invoke:String|replace|Foo-bar|-|]] and {{subst:User:Cainamarques/sandbox}}}}]]
where User:Cainamarques/sandbox just contains two brackets [[ hehe.
Technical 13, definitely there is. All I need is to create wikilinks of two pages that are contained within a page title. These pages are separated by a semicolon and a whitespace, like:
- Wikipédia:Fusão/Central de fusões/Imagem 3D; Estereoscopia
In this example, the pages are "Imagem 3D" and "Estereoscopia". The code is going to be in a Preload page, and it's necessary to output clean wikicode, so subst: is needed. So the code below is what I came up with:
[[{{<includeonly>subst:</includeonly>#invoke:String|replace|{{#titleparts:{{PAGENAME}}||3}}|(; )|=]] [[|plain=false}}]]
I guess it's ok. Sorry for my English. Cainamarques (talk) 15:48, 3 December 2013 (UTC)
- Cain, you are aware that Lua Module:s are not subst:itutable, right? Technical 13 (talk) 16:07, 3 December 2013 (UTC)
- I am now :) Makes sense now that I think about it... Thank you, Technical 13. Cainamarques (talk) 16:24, 3 December 2013 (UTC)
- @Technical 13: Yes they are. Look at Module:Unsubst for a prime example. Anomie⚔ 16:33, 3 December 2013 (UTC)
- Cain, I'm guessing by the "Wikipédia:Fusão/Central de fusões/Imagem 3D; Estereoscopia" pagename that this is not on the English Wikipedia here, can you check Special:Version of the Wikipedia where you want to do this and see if mw:Extension:StringFunctions or something similar is installed? If it is, I may be able to help you with an alternative that will be subst:itutable. Technical 13 (talk) 16:36, 3 December 2013 (UTC)
Anomie is right, modules are substitutable. Anyway my solution above stops working when the simple text "Foo-bar" is changed to the expression I wanna use: {{ #titleparts:{{PAGENAME}} }}. Even so, it would work if not trying to subst: the Lua module. Either way, it is not good enough. If not for bugzilla:2777, it would be easy as pie. Technical 13, I'm from pt.wiki, there is no such thing, thank you for the support. I'll try again in the future... Cainamarques (talk) 17:52, 3 December 2013 (UTC)
Another
{{#invoke:String|len|s={{#invoke:String|replace|source= <span style="padding-left: 0.125em;"><!-- 1em/8 : equivalent to a "fine space" -->!</span> |pattern= %b<> }}}}
→ 45
{{#invoke:String|len|s={{#invoke:String|replace|source= <span style="padding-left: 0.125em;">!</span> |pattern= %b<> }}}}
→ 45
{{#invoke:String|len|s={{#invoke:String|replace|source= <span style="padding-left:.125em;">!</span> |pattern= %b<> }}}}
→ 43
The same when using pattern=<.->:
{{#invoke:String|len|s={{#invoke:String|replace|source= <span style="padding-left: 0.125em;"><!-- 1em/8 : equivalent to a "fine space" -->!</span> |pattern= <.-> }}}}
→ 45
{{#invoke:String|len|s={{#invoke:String|replace|source= {{#tag:nowiki|<span style="padding-left: 0.125em;"><!-- 1em/8 : equivalent to a "fine space" -->!</span>}} |pattern= %b<> }}}}
→ 34
What to do? I am trying to get a result of "1". --Jerome Potts (talk) 19:00, 28 May 2014 (UTC)
- The string module defaults to |plain=true which means the search pattern is plain text and is not a regular expression. To fix, add |plain=false in the above. Johnuniq (talk) 02:38, 29 May 2014 (UTC)
- Oops ! Thank you much. --Jerome Potts (talk) 05:05, 29 May 2014 (UTC)
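The difference between a plain-text search and a pattern search can be seen in plain Lua as well. The sketch below uses only standard string functions and a shortened version of the span from the question; it shows the underlying Lua behaviour and is not the module's own code.

```lua
local source = '<span style="padding-left: 0.125em;">!</span>'

-- Plain search: the fourth argument to string.find turns off pattern matching,
-- so "%b<>" is looked for as the literal four characters and is not found.
local plain_hit = string.find(source, "%b<>", 1, true)

-- Pattern search: %b<> matches a balanced <...> pair, so both tags are removed.
local stripped, count = string.gsub(source, "%b<>", "")

print(plain_hit, stripped, count, #stripped)
```

With pattern matching enabled, only the "!" remains, which is the length of 1 the question was after.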
str.match & str.replace bug
In Lua
string.match("abc,def", "^%w*(%W*)%w*$")
returns ,
but Module:String's implementation
{{#invoke:String|match|s=abc,def|pattern=^%w*(%W*)%w*$|plain=false}}
returns ,def
which is clearly wrong. — {{carismagic|5 February 2014, 08:58}}
- The same problem for replacing with str.replace:
{{#invoke:String|replace|source=abc,def|pattern=^%w*%W*|replace=123|plain=false}}
returns 123 instead of 123def. —{{carismagic|5 February 2014, 10:28}}
Uh oh, this looks awkward. These results occur in a module debug console (for example, click "edit" here, then paste in each line beginning with "=" and press Enter, one line at a time):
=string.match('abc,def', '^%w*(%W*)%w*$')
,
=mw.ustring.match('abc,def', '^%w*(%W*)%w*$')
,def
=string.match('abc', '%W')
nil
=mw.ustring.match('abc', '%W')
a
In other words, it's a bug in the mw library. Johnuniq (talk) 10:56, 5 February 2014 (UTC)
- Exactly, but I didn't know where to report the bug, so I wrote about it here. — {{carismagic|5 February 2014, 11:37}}
- Bugzilla is where you report a bug. In this case, I've filed it for you: T62908. Anomie⚔ 17:15, 5 February 2014 (UTC)
- Thank you for filing it! Next time I'll make an account there and report it myself. — {{carismagic|5 February 2014, 18:37}}
Plain truth
- plain
- A flag indicating that the pattern should be understood as plain text. Defaults to false.
and
- plain
- Boolean flag indicating that target should be understood as plain text and not as a Lua-style regular expression, defaults to true
The parameter flag values for "true" and "false" should be explicitly given in the documentation, because someone looking at this page may not know this information as it differs between programming languages (or they may not know any programming language). At the moment, finding out what to pass into the module to change the boolean argument "plain" involves hunting in the code at the bottom of the page. -- PBS (talk) 10:27, 28 February 2014 (UTC)
Where can I find the documentation for wild chars?
Where can I find the documentation for wild chars? %w, %d, %s and so on. I have a regexp that matches two or more words and I need to change it into matching three or more words and I can't figure how to use the wild chars. — Ark25 (talk) 06:37, 23 March 2014 (UTC)
- —Trappist the monk (talk) 09:57, 23 March 2014 (UTC)
Error
Hello,
I tried to implement this module on my project, but somehow I got an error: "Script error: Lua error: Internal error: The interpreter exited with status 126." I'm using MediaWiki 1.21 --62.65.230.239 (talk) 16:34, 11 May 2014 (UTC)
- You mean, not on a Wikipedia somewhere? If so, a guess would suggest some version problem—incompatible versions of MediaWiki and/or extensions. See Special:Version. Start by finding a very simple module and trying to make that work. Johnuniq (talk) 23:39, 11 May 2014 (UTC)
anchordecode
I created a lossy anchordecode function at Module:Cite doi. It reverses the effect of the anchorencode: parser function, though there can be false positives. It's being used for a preload template, but it might be useful for other purposes as well. It should probably be moved into this module. – Minh Nguyễn (talk, contribs) 06:14, 8 October 2014 (UTC)
- IMO, trying to decode anchorencoded text probably means you're doing something wrong. Use a non-lossy encoding of some sort instead. Anomie⚔ 10:12, 8 October 2014 (UTC)
- Ideally, yes. In the case of {{cite doi}}, we had to deal with Citation bot's implementation. I just figured there might be some other use for it, that's all. – Minh Nguyễn (talk, contribs) 07:03, 10 October 2014 (UTC)
Counting
I'm looking for a simple function; maybe I overlooked it somewhere, as it doesn't seem an absurd function. Basically I want to count the occurrences of a single character within a string. What I'm particularly thinking of is output from wikidata, which gives a comma-separated list of values. I just want to know how many values, so I thought of counting the commas, plus 1. Maybe there's a more direct way within wikidata, but I've not found that either! Unbuttered parsnip (talk) mytime= Fri 17:30, wikitime= 09:30, 3 April 2015 (UTC)
- I don't know if there is anything built for the purpose, but if you were writing a module, you could use gsub to replace each comma with anything; gsub returns the count as the second value which is easily captured. In a template, you could use this module to replace all non-comma characters with an empty string, then get the length of the result. Johnuniq (talk) 10:25, 3 April 2015 (UTC)
- No I don't fancy writing a module, but you gave the idea of comparing the lengths of the original string and its length after REPLACEing all commas with nothing (plus 1). Unbuttered parsnip (talk) mytime= Fri 19:27, wikitime= 11:27, 3 April 2015 (UTC)
- Not exactly, here is the idea (using "aa,bb,cc" as the example text to be tested):
{{#invoke:String|len|{{#invoke:String|replace|aa,bb,cc|[^,]|plain=false}}}}
→ 2
- The result is 2 which is the number of commas. Johnuniq (talk) 00:06, 4 April 2015 (UTC)
- My idea was
{{#expr:{{str len|abc,def,ghi,klmnopq,r,stz}}-{{str len|{{replace|abc,def,ghi,klmnopq,r,stz|,|}}}}}}
→ 5
and I can more easily see what's going on – I never really mastered regexps. Unbuttered parsnip (talk) mytime= Sat 12:34, wikitime= 04:34, 4 April 2015 (UTC)
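For anyone who does write a module, the gsub approach mentioned above can be sketched in plain Lua like this. The helper name count_char is made up for the example, and the character passed in is assumed not to be a Lua pattern magic character.

```lua
-- count_char is a hypothetical helper name; string.gsub is standard Lua.
local function count_char(s, ch)
    -- gsub returns the modified string and the number of replacements;
    -- select(2, ...) keeps only the count. ch is assumed to be a
    -- non-magic character such as ",".
    return (select(2, string.gsub(s, ch, "")))
end

local list = "aa,bb,cc"
local commas = count_char(list, ",")

-- The template-level variant: delete everything except commas, then take the length.
local only_commas = string.gsub(list, "[^,]", "")

print(commas, #only_commas, commas + 1)
```

Both routes give the number of commas; adding 1 gives the number of values in a non-empty comma-separated list.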
arraytostring
Hello, on it.wiki we added a function called arraytostring (you can guess what it does) and we have already found many uses for it. I'm just suggesting adopting it on en.wiki too. It makes it possible to do any repetitive task, such as converting |par1=X|par2=Y|par3=Z... to X, Y, Z... without repeating code, without subtemplates and without specific modules. --Bultro (talk) 13:06, 31 July 2015 (UTC)
Pattern matching
Currently the documentation states:
- pattern
- The pattern or string to find within the string
...
- plain
- Boolean flag indicating that pattern should be understood as plain text and not as a Lua-style regular expression. Defaults to false (to change: plain=true)
I know what a regular expression is but it is not clear to me what a "Lua-style regular expression" is.
If I wanted to pass in a regular expression of code |[45] how would I escape the pipe symbol so that the function did not mistake the pipe symbol as a parameter delimiter?
Or if I passed in plain=false as a pattern how do I do that without the function interpreting it as a parameter?
-- PBS (talk) 06:37, 13 May 2017 (UTC)
- For pipe, use {{!}} (see Template:! which points out that it is a magic word, not a template). That is untested; I'm only assuming that standard method works when invoking a module. For equals, use pattern= to identify the pattern (that is, pattern=plain=false). The link at "Scribunto patterns" has documentation. Johnuniq (talk) 08:04, 13 May 2017 (UTC)
- I could escape them that way, but the documentation does not explain that there are special characters that need escaping, or that you have to use named parameters if the pattern includes an equals. A few examples of standard use, an example escaping [0-9], and some examples with typical gotchas (like using an = in a pattern) would improve this documentation, because it would allow novices to get up to speed in using these functions much more quickly. As a novice I would appreciate it if an editor experienced in using these modules could add such documentation. -- PBS (talk) 08:50, 13 May 2017 (UTC)
- "Regular expression" is a poor term to use there, as Lua's patterns aren't as powerful (for example they lack alternation). I see links to documentation explaining Lua patterns (for string and ustring) are already present, but could perhaps be moved into the documentation of the 'pattern' parameter. Anomie⚔ 12:11, 13 May 2017 (UTC)
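To make the point about alternation concrete, here is a minimal plain-Lua sketch. It only concerns what happens once the pattern has reached Lua (escaping the pipe in wikitext is a separate matter), and first_match is a hypothetical helper written for this example, not part of any library.

```lua
-- In a Lua pattern "|" is an ordinary character, not alternation.
local literal_pipe = string.match("code |4 here", "|[45]")

-- So "cat|dog" only matches the literal text "cat|dog".
local no_alternation = string.match("I have a dog", "cat|dog")

-- Without alternation, the alternatives have to be tried one pattern at a time.
local function first_match(s, patterns)
    for _, pattern in ipairs(patterns) do
        local found = string.match(s, pattern)
        if found then
            return found
        end
    end
    return nil
end

print(literal_pipe, no_alternation, first_match("I have a dog", { "cat", "dog" }))
```

A character set such as [45] still works as in most regular expression dialects, which is why only some habits carry over from regular expressions to Lua patterns.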
above br separated entries
Is {{#invoke:string|replace|{{{name}}}| |<br />|plain=false}} doing something except an extra LUA call? --Xoristzatziki (talk) 06:16, 12 August 2017 (UTC)
- It replaces every space with a break, although plain=false is not useful.
{{#invoke:string|replace|apple banana cherry| |<br />}}
→ apple
banana
cherry
- If there is a problem, please link to the page with the problem and outline the issue. Johnuniq (talk) 06:49, 12 August 2017 (UTC)
Protected edit request on 13 December 2018
Please update the code to use common getParameter functions as demonstrated in sandbox. These functions tend to be common. It is also used in Module:StringFunc. Ans (talk) 19:06, 13 December 2018 (UTC)
- Wouldn't it be better to use getArgs() from the very well used and accepted Module:Arguments instead of yet-another-module that does more-or-less-the-same-thing? Same for Module:yesno?
- —Trappist the monk (talk) 19:14, 13 December 2018 (UTC)
- Module:yesno is OK, but I cannot find function in Module:Arguments that does the same thing --Ans (talk) 23:13, 13 December 2018 (UTC)
- I think there might be some confusion about the proposed edit. My very quick look suggests that certain functions are currently in Module:String and Ans wants to use those functions in another module. The proposal is that the functions be moved from Module:String to a more general module which can be used as required. In a traditional software project, splitting out functions that are used elsewhere is exactly what would be done. I once used an excellent package where every function was in a separate source file and I'm still not sure about that extreme. However I don't know whether splitting short functions from commonly used modules is desirable. Practical issues include the need to monitor more than one module to know what's going on and to ensure this module is not broken, and to apply suitable protections and handle edit requests for the other modules. Moreover other projects would then need to copy yet another module. Repetition is evil but so is complexity. Johnuniq (talk) 02:37, 14 December 2018 (UTC)
Not done please establish a consensus for this change here then reactivate the edit request. — xaosflux Talk 00:09, 16 December 2018 (UTC)
escapePattern function
I've added an escapePattern function to the sandbox, together with some test cases. This function can be used from wiki pages to escape Lua string patterns. I intend to use this in Template:Basic portal start page to escape the first argument to {{Transclude selected recent additions}}. The default argument is {{subst:PAGENAME}}, which was causing problems on portal names with magic characters in them. (For example, Portal:T.I., now deleted, was at one point displaying a DYK that contained the text "TGIF".) I don't imagine there will be many other use cases for this function, but I think in some limited circumstances it would prove useful. What do people think about adding it to the module? — Mr. Stradivarius ♪ talk ♪ 15:11, 3 May 2019 (UTC)
- That looks good although I wonder if showing an error for a missing parameter is best. Why not output an empty string if there is no input? Sorry to make a massive diff, but I cleaned the whitespace in the sandbox to use tabs for indents and remove trailing space. Are you aware of the enormous wars that have been fought over portals in recent weeks? Johnuniq (talk) 00:52, 4 May 2019 (UTC)
- @Johnuniq: I did that because I think calling the function without the pattern parameter will always be an error: it indicates someone trying to call the function with no parameters or the wrong parameters, or a mistake in template syntax. An empty string argument like {{#invoke:String|escapePattern|}}, on the other hand, could happen in a template doing something like {{#invoke:String|escapePattern|{{{foo|}}}}}; the function outputs an empty string in this case. I admit to not being fully aware of the recent drama surrounding portals when I wrote the function last night, although I knew it was contentious. I've been reading up on the declined ArbCom case this morning. My stance is that if we're going to keep Template:Basic portal start page around, we may as well make sure it works properly. Best — Mr. Stradivarius ♪ talk ♪ 01:50, 4 May 2019 (UTC)
- The escape function will be useful. Re portals, I mentioned that because it would not be worth spending time on the portal automation system. Johnuniq (talk) 02:01, 4 May 2019 (UTC)
- I've gone ahead and added the function. I did it in two edits to get a clean diff with the escapePattern function before cleaning up the whitespace. Let me know if you notice anything amiss. — Mr. Stradivarius ♪ talk ♪ 03:20, 4 May 2019 (UTC)
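The Portal:T.I. problem described in this thread can be reproduced in plain Lua. The following is only a sketch: `escape_pattern` is a hypothetical stand-in rather than the module's actual code, the error message is invented, and it uses the standard `string` library instead of `mw.ustring`.

```lua
-- Hypothetical helper: prefix every Lua pattern magic character with "%".
local function escape_pattern(pattern)
    if pattern == nil then
        -- Per the discussion, a missing parameter is treated as an error.
        error("No pattern string specified")  -- hypothetical message
    end
    -- "%%%0" emits a literal "%" followed by the matched character.
    local escaped = pattern:gsub("[%(%)%.%%%+%-%*%?%[%^%$%]]", "%%%0")
    return escaped
end

local name = "T.I."
-- Unescaped, each "." matches any character, so "TGIF" is a hit.
local raw_hit = string.find("TGIF", name) ~= nil
-- Escaped, the dots are literal and "TGIF" no longer matches.
local escaped_hit = string.find("TGIF", escape_pattern(name)) ~= nil
```

An empty string passes through unchanged, which mirrors the empty-argument behaviour described above.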
Making match function available to other modules
Following a request, I've amended the function str.match(frame) to call a subroutine str._match() with the arguments passed as parameters. That should allow the function to be exported for use in other modules. Gonnym has been testing the /sandbox version and reports no problems, so I've updated the main module from that sandbox. Please revert if any unexpected issues arise, although the change is minor so (hopefully) hasn't much potential to screw up. Cheers --RexxS (talk) 16:08, 13 May 2019 (UTC)
- @RexxS:, how can I access |ignore_errors= from a module call? --Gonnym (talk) 10:29, 27 November 2019 (UTC)
- I was able to find the |nomatch= but not to disable the categories. Anyways, I fixed the issue I had so don't need this atm. --Gonnym (talk) 11:04, 27 November 2019 (UTC)
- @Gonnym: That's difficult, because the module uses a call at line 509 to create a new frame object from the current frame and then attempts to extract error_category, ignore_errors and no_category from that frame object (rather than passing these parameters into the call each time). That's fine if the module is being #invoked, but of course fails – as you found – if a function is exported for use in an external module. I'll sandbox a fix and let you know when I think it's working. --RexxS (talk) 13:42, 27 November 2019 (UTC)
- @Gonnym: I think the current sandbox will handle error_category, ignore_errors and no_category now:
- In the calling module, you create a pseudo-frame object, for example, local f = {}.
- Then set f.args = {}.
- Then set f.args.error_category = "Your category", and/or f.args.ignore_errors = true and/or f.args.no_category = true as required.
- Finally, call _match with parameters ( s, pattern, start, match_index, plain_flag, nomatch, f ).
- Let me know if it works for you. Cheers --RexxS (talk) 14:09, 27 November 2019 (UTC)
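The pseudo-frame steps listed in this thread could be wrapped up roughly as follows. This is a sketch only: `make_pseudo_frame` is a hypothetical helper name, and the final call is left as a comment because Module:String can only be loaded on a wiki.

```lua
-- Hypothetical helper that follows the steps described in the thread.
local function make_pseudo_frame(options)
    local f = {}
    f.args = {}
    f.args.error_category = options.error_category
    f.args.ignore_errors = options.ignore_errors
    f.args.no_category = options.no_category
    return f
end

-- Suppress both the error text and the tracking category.
local f = make_pseudo_frame{ ignore_errors = true, no_category = true }

-- Inside a Scribunto module the call would then look roughly like this:
-- local str = require("Module:String")
-- local result = str._match(s, pattern, start, match_index, plain_flag, nomatch, f)
```

Options that are not supplied simply stay nil in `f.args`.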
Edit request to implement merges
Please sync the sandbox of this module to merge Module:Join, Module:Str endswith, Module:PatternCount and Module:Text count into it per the applicable TfDs for those modules.
A few notes:
- Module:PatternCount and Module:Text count do the same thing, and are therefore combined into one function called str.count. That function also supports a |plain= parameter, not in either source module, for consistency reasons.
- Module:Str endswith is a module that takes arguments from the calling template. I chose to instead implement a pure "does this string end with this other string" function in Lua, leaving the parameter-handling code to Template:Str endswith (see Template:Str endswith/sandbox).
- I chose not to merge the unused functions of Module:Join, leaving only the join function as str.join.
* Pppery * survives 00:19, 26 May 2019 (UTC)
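For readers unfamiliar with the idea, a pure "ends with" check of the kind described in the notes might look like the sketch below. The name `endswith` and its two-argument signature are assumptions for illustration, not the sandbox code, and it compares bytes using the standard `string` library.

```lua
-- Hypothetical pure helper: true if `s` ends with `suffix`.
local function endswith(s, suffix)
    if suffix == "" then
        -- s:sub(-0) would return the whole string, so handle this case first.
        return true
    end
    -- Compare the last #suffix bytes of s with the suffix.
    return s:sub(-#suffix) == suffix
end
```

Keeping parameter handling out of the function, as the note proposes, leaves the helper easy to call from other modules.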
- There seems to be an undefined global variable, j, at line 474. Has anyone checked the code that's being merged into this module? Considering the six million plus transclusions, I think we need to do this merge carefully. Is there a plan to merge the documentation? --RexxS (talk) 00:58, 26 May 2019 (UTC)
- The undefined global variable appears to be present in Module:Join as well, so it's not an error in the merge per se, just an error in the code being merged, and I'm not sure what it is supposed to refer to. And yes, I do plan to merge the documentation, but it seems less damaging to me to have undocumented code than functions that exist in the documentation but not the code, so I'm not going to merge the documentation until after the code is merged. * Pppery * it has begun... 01:04, 26 May 2019 (UTC)
- Thank you for the plan to merge the documentation. I agree that's the best way to do it. The problem with undefined globals is that they have the potential to interact with other functions that may be merged into this module in future (and that's particularly a risk for simple names like 'j'). I'm pretty certain that the spurious 'j' is a relic of copying the documentation for table.concat and simply needs to be removed (it will have previously defaulted to nil in its original module); do you agree? --RexxS (talk) 01:19, 26 May 2019 (UTC)
- Indeed, that does seem likely; feel free to go ahead and remove the j (or make any other changes you feel are necessary to the sandbox). * Pppery * it has begun... 01:22, 26 May 2019 (UTC)
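The risk raised in this exchange can be illustrated in plain Lua. The snippet is a sketch, not the Module:Join code: it assumes the stray `j` was being passed as the last argument of table.concat, as the thread suggests. An undeclared global reads as nil, which table.concat treats the same as an omitted argument, until some other code assigns to that global.

```lua
local parts = { "one", "two", "three" }

-- `j` is never declared here, so reading it is a global lookup that yields nil.
local with_stray_j = table.concat(parts, ",", 1, j)
local without_j = table.concat(parts, ",")

-- If other code later assigns a global `j`, the same call changes meaning.
j = 2
local after_assignment = table.concat(parts, ",", 1, j)
j = nil  -- tidy up the global again
```

The first two results are identical, while the third stops after the second element, which is why removing the spurious name is the safer fix.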
I did a small amount of cleaning in Module:String/sandbox and might do a little more. There is a pairs problem in the new join function but I'm going to ponder how the following result occurs before thinking about it.
{{#invoke:String/sandbox|join|,|home=unknown|one|two|three|extra=xyz}} → one,two,three
{{#invoke:String/sandbox|join||home=unknown|one|two|three|extra=xyz}} → onetwothree
The last example is currently giving twoonethreeoneunknownonexyz. What should occur if named parameters are used? For consistency with other functions, I think they should be ignored; that is, ipairs should be used. Johnuniq (talk) 02:09, 26 May 2019 (UTC)
- Oh, that's obvious. There appears to be no way to specify an empty separator. It ignores the empty parameter and uses "one" as the separator. What should happen? Johnuniq (talk) 02:17, 26 May 2019 (UTC)
- I agree, it should be treated as an empty separator and named parameters should be ignored. The reason for the odd behavior before was because I copied Module:Join's join function nearly verbatim (only changes were reindenting and renaming a variable), and therefore all of its quirks-- but a join function added to Module:String should have a clean API, including handling of empty parameters, unknown parameters, and other errors. I've deactivated the edit request template so that you can continue your code review unencumbered by my changes being prematurely synced live. * Pppery * it has begun... 02:28, 26 May 2019 (UTC)
- @Johnuniq: I dealt with the problems of leading/trailing whitespace in the named prefix parameters in WikidataIB by assuming we'd never use double quotes and then stripping them from the parameter like this: local lp = (args.linkprefix or args.lp or ""):gsub('"', ''). You could take a similar course and strip double quotes from the first unnamed parameter, allowing you to use calls like {{#invoke:String/sandbox|join|""|home=unknown|one|two|three|extra=xyz}}. The other option that occurs to me is to use an extra parameter |nosep= (or similar) which would be a boolean switch that made the separator the empty string when set to false. Users would still have to supply a dummy value for the first unnamed parameter, though, or you'd have to go through the args moving all of the unnamed parameter indexes up by 1. What do you think? --RexxS (talk) 02:45, 26 May 2019 (UTC)
- I prefer Pppery's idea above and I implemented it in the sandbox. The result is that named parameters are ignored and the first parameter is always used as the separator. The separator can be empty; that is very understandable and would be expected by users. Any empty parameters in those following the separator are ignored. Johnuniq (talk) 03:43, 26 May 2019 (UTC)
- BTW, some functions have redundant semicolons while most don't. Should I remove the semicolons? I'm inclined to do that despite it making diffs more complex because people often learn by example and they might think that the semicolons were somehow desirable. Johnuniq (talk) 05:15, 26 May 2019 (UTC)
- That seems to work fine. I'd remove the semicolons and not worry about the diffs. The more prominent a module is, the more we ought to ensure it reflects best practice. --RexxS (talk) 11:44, 26 May 2019 (UTC)
- OK, that makes sense. I've removed the semicolons from the sandbox. * Pppery * it has begun... 15:59, 26 May 2019 (UTC)
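The behaviour agreed on in this thread (first positional parameter is the separator, which may be empty; named parameters and empty values are ignored) can be sketched in plain Lua as follows. The function below is a hypothetical stand-in that takes an ordinary table in place of frame.args; it is not the sandbox implementation.

```lua
-- Hypothetical stand-in for the join behaviour described in the thread.
local function join(args)
    local sep = args[1] or ""
    local parts = {}
    -- ipairs visits only the positional entries 1, 2, 3, ..., so named keys
    -- such as `home` or `extra` are never seen.
    for i, value in ipairs(args) do
        if i > 1 and value ~= "" then
            parts[#parts + 1] = value
        end
    end
    return table.concat(parts, sep)
end

local with_comma = join{ ",", "one", "two", "three", home = "unknown", extra = "xyz" }
local with_empty = join{ "", "one", "two", "three", home = "unknown", extra = "xyz" }
```

Using pairs instead would also visit the named keys, and in no guaranteed order, which is consistent with the jumbled output reported earlier in the thread.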
- @Pppery: just making sure as you haven't mentioned it here, is Module:String count part of this merge? --Gonnym (talk) 18:17, 26 May 2019 (UTC)
- @Gonnym: No, that's not being merged into Module:String, it is a separate TfD for which the result is delete. No new functionality needs to be added to Module:String to implement Module:String count in Wikitext other than the functionality added by merging the other modules into it. * Pppery * it has begun... 21:18, 26 May 2019 (UTC)
@RexxS and Johnuniq: Do you have any further tweaks to make to the code or is this ready to be moved to Module:String? * Pppery * it has begun... 02:06, 27 May 2019 (UTC)
- Yes, I think Module:String can be updated from Module:String/sandbox, bearing in mind that all I've done is examine the new code without checking any testcases. I would prefer that str.count handled the case of missing source or pattern parameters. However, the calling template ensures the parameters are never nil so if any generalization of count is needed in the future, that problem can be considered then. Johnuniq (talk) 03:38, 27 May 2019 (UTC)
- @Johnuniq: What "calling template", exactly? Module:PatternCount and Module:text count are both modules that do not specifically implement any template but instead were called directly from non-template Wikitext pages. To paraphrase myself from earlier, "a [count] function added to Module:String should have a clean API, including handling of empty parameters, unknown parameters, and other errors", which leads to the question of what {{#invoke:String|count}} or {{#invoke:String|count|foo}} should do. Produce a custom error message (respecting |ignore_errors=)? Silently treat any missing values as the empty string? * Pppery * it has begun... 03:54, 27 May 2019 (UTC)
- @Pppery: Hmm, I might be losing it. 24 hours ago I used what-links-here for a module that was being replaced by the new functions. It was used in lots of articles so I then checked what templates used it. I found a template where you had edited the sandbox to call string/sandbox. The template only used invoke if parameters 1 and 2 were nonblank, otherwise it output an empty string. That must have been endswith that I looked at? At any rate, I agree the functions should handle missing parameters so I made another edit to the sandbox to fix that. I checked what mw.ustring.gsub does, and it agrees with string.gsub (as it should) in the way empty strings are handled. For count, the result is that an empty or missing source will give 0 if the pattern is not empty, or 1 if the pattern is empty or missing. If the source is not empty, an empty or missing pattern will give n+1 where n is the number of Unicode characters in source. For example, counting how many times an empty string occurs in abc would give 4 (an empty string occurs before and after each character). That's what Lua gsub does and is reasonable. Johnuniq (talk) 05:09, 27 May 2019 (UTC)
- Yeah, it does look like you were talking about {{str endswith}}, which has that wikitext shim because (a) the template has weird backwards compatibility requirements and (b) I don't want Module:String to take arguments from the parent frame. In any case, I've reactivated the edit request template. * Pppery * it has begun... 14:07, 27 May 2019 (UTC)
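The empty-string cases described in this exchange can be checked with a small plain-Lua sketch. It assumes, as the discussion suggests, that counting is done by taking the second return value of gsub; `count` here is a hypothetical stand-in that uses string.gsub, so it counts bytes rather than the Unicode characters that mw.ustring.gsub would see.

```lua
-- Hypothetical stand-in: count pattern matches via gsub's second return value.
local function count(source, pattern)
    source = source or ""    -- treat a missing source as empty
    pattern = pattern or ""  -- treat a missing pattern as empty
    local _, n = string.gsub(source, pattern, "")
    return n
end

local empty_in_abc = count("abc", "")   -- one match around each character
local empty_in_empty = count("", "")    -- a single empty match
local a_in_empty = count("", "a")       -- nothing to match
```

For ASCII input such as abc the byte count and the character count coincide, so the n+1 rule gives 4 in both libraries.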