[No...] Posted September 18 What is the correct way to read an entire (big) text file into a string in PCM? I tried something like this:

   File="C:\test.txt"
   TestString=File.asFilename.contentsOfEntireFile
   display(TestString)

The first two lines work, but if the text file exceeds a certain size, the display() function fails, always complaining about a "missing second quotation mark", followed by various other errors. The same happens with other string operations, e.g. len(TestString). Is this a PCM bug, or what am I doing wrong? I know there is readListFile(), but in the end I need a string. And as soon as I convert the list to a string, the same errors occur.
[Ri...] Posted September 18 How many characters are in the string? Just a guess: maybe display() is limited to 256 characters?
[Ch...] Posted September 18 Does this file contain letters and numbers only, or many other characters as well?
[Ch...] Posted September 18 It seems to work for me. That was only 2,000 characters, though. How big is your file? EDIT: It also worked with 20,000 characters. Calypso 2023 7.6, with randomly generated text from ChatGPT 😁
[No...] Posted September 18 (Author) It's a normal written text file, so it may contain all sorts of characters from the ASCII range (nothing below asc(32) except linefeeds). The file in the example above was 29 KB. Not that I really need to read a file THAT big. I just noticed that PCM starts to act up when a file gets too big, and used a random one for tests. I haven't yet figured out the actual limit. It's not a limitation of the display() function or the display window. I have displayed thousands of characters at once without a problem, as long as I created the string solely from within PCM (in a loop, for instance). This only happens when I read from a file with Smalltalk extensions. And as mentioned above, it also happens when I use other functions like len() on the resulting string.
[No...] Posted September 18 (Author) Very strange. I use the same Calypso version, and it also happened with 2024. I think 20,000 characters is well above the size limit that triggered the error. Maybe try something without quote characters? I haven't checked my files for those.
[Ch...] Posted September 18 I made the first and last characters quotation marks (") and added a few more quotes throughout. It still worked here.
[No...] Posted September 19 (Author) This is insane! Today I created a 30 KB text file containing a repetition of all the ASCII characters from 32 to 127 without any linefeeds. And it works! 😮😮 So size doesn't seem to be the critical factor. What the heck is the difference between this and some random hand-typed text file of the same size that makes PCM go crazy? 🤯 And what is even more insane: in the parameter list I can see that the string variable really contains everything from the file in question. So PCM has no problem displaying it there. I just don't get it.
[No...] Posted September 19 (Author) HA! Found it! It's the double slash ('//') that makes PCM hiccup. File size doesn't matter at all. You don't even need a file. Every string containing double slashes causes the error. I guess it interferes with PCM's comment recognition somehow. Sooo, how can I prevent that from happening? Removing those characters is not an option.
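The guess about comment recognition can be illustrated with a small, language-neutral sketch in Python. This is not PCM code and says nothing about how Calypso actually parses statements. It only assumes a hypothetical parser that cuts a line at the first '//' without checking whether it sits inside a quoted string. The function names are made up for the illustration.

```python
def strip_line_comment(line: str) -> str:
    """Naively cut everything from the first '//' onward, ignoring quotes.

    Hypothetical behavior, assumed only for this illustration.
    """
    pos = line.find("//")
    return line if pos < 0 else line[:pos]


def has_unterminated_string(line: str) -> bool:
    """Return True if the line contains an odd number of double quotes."""
    return line.count('"') % 2 == 1


# A statement whose string literal happens to contain a double slash.
statement = 'display("see C://temp for details")'
stripped = strip_line_comment(statement)

print(stripped)                           # what the naive parser would be left with
print(has_unterminated_string(stripped))  # the closing quote was cut away
```

Under that assumption, the closing quotation mark is discarded together with the supposed comment, which would match the "missing second quotation mark" message.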
[Ma...] Posted September 19 How do you work with that text? Can you read the file bit by bit, as a stream or line by line? I would imagine the failure is not caused by reading into the variable itself, but only by accessing it afterwards. Edit: Would it be possible to use the basic PCM command readListFile and convert the result into a string?
[No...] Posted September 19 (Author) I already tried a list file and used .asString as well as some custom PCM code to cut the string from the list. But it's always the same: as soon as I access the string variable with a PCM function, it fails.
[Je...] Posted September 19

   Cont=readListFile("C:\test.txt")
   for A=1 to Cont.size
      if A==1 then
         // First line: start the result string
         for B=1 to getTechnologySegment(Cont,A).size
            Buchst=mid(getTechnologySegment(Cont,A),B,1)
            if Buchst=="/" then
               Buchst="°"
            endif
            if B==1 then
               Zeile=Buchst
            else
               Zeile=Zeile+Buchst
            endif
         next B
         Text=Zeile+cr()
      else
         // All further lines: append to the result string
         for B=1 to getTechnologySegment(Cont,A).size
            Buchst=mid(getTechnologySegment(Cont,A),B,1)
            if Buchst=="/" then
               Buchst="°"
            endif
            if B==1 then
               Zeile=Buchst
            else
               Zeile=Zeile+Buchst
            endif
         next B
         Text=Text+Zeile+cr()
      endif
   next A
   message(Text)

The code reads the file character by character and replaces "/" with "°". I don't know how fast this is with very large files, though. Unfortunately, empty lines are not detected.
[No...] Posted September 19 (Author) Thanks for the code! The only problem is that I must not modify the string. The "/" characters have to stay in. So I would need a different method to handle the unmodified string, or rather the string variable, bypassing the PCM interpreter, so to speak. I might experiment with executeCode().
[Je...] Posted September 19 What do you need the string for? There are ways to replace the "°" with "/" again afterwards, for example when the string or parts of it are to be written to a file. I have revised the code once more. It now handles Ä, ä, Ö, ö, Ü, ü and empty lines correctly:

   vC=0
   Cont=readListFile("C:\test.txt")
   for A=1 to Cont.size
      if A==1 then
         // First line: start the result string
         if getTechnologySegment(Cont,A).size==0 then
            Zeile=cr()
         else
            for B=1 to getTechnologySegment(Cont,A).size
               Buchst=mid(getTechnologySegment(Cont,A),B,1)
               if B<getTechnologySegment(Cont,A).size then
                  // An umlaut occupies two positions: merge them and skip one
                  if text(mid(getTechnologySegment(Cont,A),B,1)+mid(getTechnologySegment(Cont,A),B+1,1))=="Ä" then
                     Buchst="Ä"
                     B=B+1
                     vC=1
                  endif
                  if text(mid(getTechnologySegment(Cont,A),B,1)+mid(getTechnologySegment(Cont,A),B+1,1))=="ä" then
                     Buchst="ä"
                     B=B+1
                     vC=1
                  endif
                  if text(mid(getTechnologySegment(Cont,A),B,1)+mid(getTechnologySegment(Cont,A),B+1,1))=="Ö" then
                     Buchst="Ö"
                     B=B+1
                     vC=1
                  endif
                  if text(mid(getTechnologySegment(Cont,A),B,1)+mid(getTechnologySegment(Cont,A),B+1,1))=="ö" then
                     Buchst="ö"
                     B=B+1
                     vC=1
                  endif
                  if text(mid(getTechnologySegment(Cont,A),B,1)+mid(getTechnologySegment(Cont,A),B+1,1))=="Ü" then
                     Buchst="Ü"
                     B=B+1
                     vC=1
                  endif
                  if text(mid(getTechnologySegment(Cont,A),B,1)+mid(getTechnologySegment(Cont,A),B+1,1))=="ü" then
                     Buchst="ü"
                     B=B+1
                     vC=1
                  endif
               endif
               if Buchst=="/" then
                  Buchst="°"
               endif
               if B==1 then
                  Zeile=Buchst
               else
                  if (B==2) and (vC==1) then
                     Zeile=Buchst
                  else
                     Zeile=Zeile+Buchst
                  endif
               endif
            next B
         endif
         if getTechnologySegment(Cont,A).size==0 then
            Text=Zeile
         else
            Text=Zeile+cr()
         endif
      else
         // All further lines: append to the result string
         if getTechnologySegment(Cont,A).size==0 then
            Zeile=cr()
         else
            for B=1 to getTechnologySegment(Cont,A).size
               Buchst=mid(getTechnologySegment(Cont,A),B,1)
               if B<getTechnologySegment(Cont,A).size then
                  if text(mid(getTechnologySegment(Cont,A),B,1)+mid(getTechnologySegment(Cont,A),B+1,1))=="Ä" then
                     Buchst="Ä"
                     B=B+1
                  endif
                  if text(mid(getTechnologySegment(Cont,A),B,1)+mid(getTechnologySegment(Cont,A),B+1,1))=="ä" then
                     Buchst="ä"
                     B=B+1
                  endif
                  if text(mid(getTechnologySegment(Cont,A),B,1)+mid(getTechnologySegment(Cont,A),B+1,1))=="Ö" then
                     Buchst="Ö"
                     B=B+1
                  endif
                  if text(mid(getTechnologySegment(Cont,A),B,1)+mid(getTechnologySegment(Cont,A),B+1,1))=="ö" then
                     Buchst="ö"
                     B=B+1
                  endif
                  if text(mid(getTechnologySegment(Cont,A),B,1)+mid(getTechnologySegment(Cont,A),B+1,1))=="Ü" then
                     Buchst="Ü"
                     B=B+1
                  endif
                  if text(mid(getTechnologySegment(Cont,A),B,1)+mid(getTechnologySegment(Cont,A),B+1,1))=="ü" then
                     Buchst="ü"
                     B=B+1
                  endif
               endif
               if Buchst=="/" then
                  Buchst="°"
               endif
               if B==1 then
                  Zeile=Buchst
               else
                  if (B==2) and (vC==1) then
                     Zeile=Buchst
                  else
                     Zeile=Zeile+Buchst
                  endif
               endif
            next B
         endif
         if A==(Cont.size) or (getTechnologySegment(Cont,A).size==0) then
            Text=Text+Zeile
         else
            Text=Text+Zeile+cr()
         endif
      endif
   next A
   display(Text)
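The idea of swapping "/" for "°" and restoring it later can be sketched independently of PCM. The following Python snippet is only an illustration of that round trip, not PCM code. The helper names mask_slashes and unmask_slashes are hypothetical. The sketch also shows one caveat: the placeholder must not already occur in the text, otherwise the restore step would turn a genuine "°" into "/".

```python
PLACEHOLDER = "\u00b0"  # the degree sign used as a stand-in for "/"


def mask_slashes(text: str, placeholder: str = PLACEHOLDER) -> str:
    """Replace every '/' with the placeholder so no '//' remains."""
    if placeholder in text:
        # Restoring would be ambiguous, so refuse instead of corrupting data.
        raise ValueError("placeholder already occurs in the text")
    return text.replace("/", placeholder)


def unmask_slashes(text: str, placeholder: str = PLACEHOLDER) -> str:
    """Undo mask_slashes by turning each placeholder back into '/'."""
    return text.replace(placeholder, "/")


original = "see http://example.com // trailing note"
masked = mask_slashes(original)      # safe to pass through slash-sensitive code
restored = unmask_slashes(masked)    # identical to the original again

print(masked)
print(restored == original)
```

Whether such a restore step is possible depends on where the string is consumed. If it is handed on to code that cannot undo the replacement, this approach does not help.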
[No...] Posted September 22 (Author) Among other things, it is processed further with executeCode, and at that point I no longer have any way to replace anything beforehand (at least none that I know of). As I said above, I will try to implement it that way from the start and more or less bypass PCM. I just need to dig a bit deeper into how to write that.
[No...] Posted Tuesday at 07:27 AM (Author) I think my problem is solved. Unfortunately, it only works if you bypass the PCM string processing completely here. Many thanks for your input! @Zeiss: It would be good if this bug could be fixed at some point.