|
Hi,
In my text file, I used a character with a value larger than 127, for example 0xDC. Then I loaded that text file onto a device. When I read the file back, that character had changed to 0xC3 and 0x9C. How come it changed to two characters...
Started by sasayins on
, 3 posts
by 3 people.
Answer Snippets (Read the full thread at stackoverflow):
Because that's the sequence for the character when encoded in UTF-8:
>>> '\xc3\x9c'.decode('utf-8')
u'\xdc'
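The same relationship can be checked in the other direction. The sketch below uses Python 3 and assumes, without confirmation from the thread, that the file was originally saved in a single-byte encoding such as Latin-1 and that the device re-encoded it as UTF-8. The variable names are illustrative only.

```python
# U+00DC is one byte in Latin-1 but two bytes in UTF-8.
ch = "\u00dc"

latin1_bytes = ch.encode("latin-1")  # assumed original file encoding
utf8_bytes = ch.encode("utf-8")      # assumed encoding after loading on the device

print(latin1_bytes.hex(), utf8_bytes.hex())  # dc c39c

# Reading the UTF-8 bytes as if they were Latin-1 yields two characters,
# which matches the "changed to two characters" symptom.
misread = utf8_bytes.decode("latin-1")
print(len(misread))  # 2
```

Decoding the bytes with the encoding they were actually written in gives back the single original character.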
From Wikipedia:
"UTF-8 encodes each character (code point) in 1 to 4 octets.
|
|
How do I get the Fixnum returned by the following:
"abc"[2]
Back into a character?
Started by Peter Coulton on
, 3 posts
by 3 people.
Answer Snippets (Read the full thread at stackoverflow):
You could do.
A single-character string for "abc"[2] , which will not respond to the chr method.
|
|
What is the character entity for the equal character in HTML? I have been looking and I cannot find the character entity reference for that one character.
EDIT:
I am building a JSLint style validator for HTML. I am not happy with current validators as...
Started by austin cheney on
, 7 posts
by 7 people.
Answer Snippets (Read the full thread at stackoverflow):
=
But—why?
=
Go here my friend
http://www.natural-innovations.com/wa/doc-charset.html
=
If http....
You can use &#61;, but it's not really necessary to escape = in HTML.
= has ASCII value 61, so the HTML entity is &#61;.
I use asciitable.com.
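As a rough illustration of the point about ASCII value 61, a numeric character reference can be built from the code point and checked with Python's standard `html` module. The variable names here are made up for the example.

```python
import html

char = "="

# Decimal and hexadecimal numeric character references built from the code point.
decimal_entity = "&#{};".format(ord(char))
hex_entity = "&#x{:X};".format(ord(char))

print(ord(char), decimal_entity, hex_entity)  # 61 &#61; &#x3D;

# Both references decode back to the plain equals sign.
print(html.unescape(decimal_entity) == char)  # True
print(html.unescape(hex_entity) == char)      # True
```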
|
|
I am used to the c-style getchar(), but it seems like there is nothing comparable for java. I am building a lexical analyzer, and I need to read in the input character by character.
I know I can use the scanner to scan in a token or line and parse through...
Started by Jergason on
, 7 posts
by 7 people.
Answer Snippets (Read the full thread at stackoverflow):
(If you....
character data from a list of file arguments:
public class CharacterHandler { public static void with " + ch); } } }
The bad thing about the above code is that it uses the system's default character set ... the Charset class for more.
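The concern about the default character set applies in most languages. As a comparison only (the thread itself is about Java), here is a minimal Python sketch of getchar-style reading with the encoding named explicitly. The helper name `read_chars` and the sample input are hypothetical.

```python
import io

def read_chars(stream):
    """Yield one decoded character at a time until end of input."""
    while True:
        ch = stream.read(1)
        if ch == "":
            return
        yield ch

# Stand-in for a file opened with an explicit encoding, for example
# open(path, encoding="utf-8"), so the result does not depend on the
# platform default.
raw = io.BytesIO("a\u00dcb".encode("utf-8"))
text_stream = io.TextIOWrapper(raw, encoding="utf-8")

chars = list(read_chars(text_stream))
print(chars)  # ['a', 'Ü', 'b']
```

A lexer would consume the generator one character at a time instead of collecting it into a list.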
|
|
Ruby will not play nice with UTF-8 strings. I am passing data in an XML file and although the XML document is specified as UTF-8 it treats the ascii encoding (two bytes per character) as individual characters.
I have started encoding the input strings...
Started by Tim Reynolds on
, 3 posts
by 3 people.
Answer Snippets (Read the full thread at stackoverflow):
It is probably better to....
If something does break, then please add a comment to let us know.
Does something break because Ruby strings treat UTF-8 encoded code points as two characters? If not, then you should not worry too much about that.
|
|
I am coding a method that returns whether a given character is valid, which looks like this: -
private static boolean isValid(char c) {
    return c == '.' || c == ',' || c == '+' || c == '/' || c == ';' || c == ':';
}
Checkstyle flagged this up as the boolean...
Started by Tarski on
, 8 posts
by 5 people.
Answer Snippets (Read the full thread at stackoverflow):
private static boolean isValid(char c) {
    String validChars = ".,+/;:";
    return (validChars.indexOf(c) > -1);
}
private static boolean isValid(char c) { switch (c) { case '.' : // FALLTHROUGH case ',' : // FALLTHROUGH case '+' : // FALLTHROUGH case '... .
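Both answers replace the chain of comparisons with a lookup. For comparison, a minimal Python sketch of the same idea using a set; the names `VALID_CHARS` and `is_valid` are illustrative and not taken from the thread.

```python
# The six characters accepted by the original isValid method.
VALID_CHARS = frozenset(".,+/;:")

def is_valid(c):
    """Return True if c is one of the accepted punctuation characters."""
    return c in VALID_CHARS

print(is_valid(";"))  # True
print(is_valid("a"))  # False
```

Keeping the accepted characters in one constant means the list can change without touching the boolean logic.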
|
|
I have a Java socket connection that is receiving data intermittently. The number of bytes of data received with each burst varies. The data may or may not be terminated by a well-known character (such as CR or LF). The length of each burst of data is...
Started by Kunal on
, 9 posts
by 9 people.
Answer Snippets (Read the full thread at stackoverflow):
Thus, you would start....
Personally.
... that you're not specifying a character encoding in your conversion from bytes to String (via ... characters: ByteArrayOutputStream can handle various character set encodings through the toString call.
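The encoding point matters most when a multi-byte character is split across two bursts. The following Python sketch (the thread is about Java, so this is only an analogy) decodes bursts incrementally. It assumes the peer sends UTF-8, which the question does not state, and `decode_bursts` is a hypothetical helper name.

```python
import codecs

def decode_bursts(bursts, encoding="utf-8"):
    """Decode an iterable of byte bursts, tolerating characters split across bursts."""
    decoder = codecs.getincrementaldecoder(encoding)()
    for burst in bursts:
        text = decoder.decode(burst)
        if text:
            yield text
    # Flush; raises UnicodeDecodeError if the stream ended mid-character.
    tail = decoder.decode(b"", final=True)
    if tail:
        yield tail

# The two bytes of U+00DC (0xC3 0x9C) arrive in different bursts.
bursts = [b"ab\xc3", b"\x9cc\r\n"]
message = "".join(decode_bursts(bursts))
print(repr(message))  # 'abÜc\r\n'
```

Decoding each burst independently would fail or corrupt the character at the boundary, which is why the decoder object is kept across bursts.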
|
|
Basic question.
char new_str[]="";
char * newstr;
If I have to concatenate some data into it or use string functions like strcat/substr/strcpy, what's the difference between the two?
I understand I have to allocate memory to the char * approach (Line ...
Started by halluc1nati0n on
, 8 posts
by 8 people.
Answer Snippets (Read the full thread at stackoverflow):
This is a character array....
Of pointer to char, the pointer can be incremented new_str++ to fetch you the next character, the line:
char new_str[] = "";
allocates 1 byte of space and puts a null terminator character ... is done for you.
|
|
In ASCII, the character < is encoded as the single byte 0x3C. What I'd like to know is whether there is a character set where < is encoded differently. I tried UTF-8, it's the same. I tried GB2312 and it's the same...
Another question, are ...
Started by Tower on
, 4 posts
by 4 people.
Answer Snippets (Read the full thread at stackoverflow):
Characters with codes > 127 are different character before (or....
They are not the same in non-ASCII character sets (such as EBCDIC).
The first 128 characters of ASCII (codes 0 to 127) are the same in all ASCII-derived character sets.
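The EBCDIC remark can be checked with Python's built-in codecs. This is a small sketch, with `cp037` chosen as one representative EBCDIC code page; the choice of code page is an assumption, since the answers do not name one.

```python
# Encode "<" in several character sets and compare the resulting bytes.
codec_names = ("ascii", "utf-8", "gb2312", "latin-1", "cp037")
encodings = {name: "<".encode(name) for name in codec_names}

for name, encoded in encodings.items():
    print(name, encoded.hex())
# ascii 3c
# utf-8 3c
# gb2312 3c
# latin-1 3c
# cp037 4c
```

The ASCII-derived encodings agree on 0x3C, while the EBCDIC code page uses 0x4C.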
|
|
I am looking for a bash or sed script (preferably a one-liner) with which I can insert a new line character after a fixed number of characters in huge text file.
Started by rangalo on
, 5 posts
by 5 people.
Answer Snippets (Read the full thread at stackoverflow):
character in the whole file
gawk 'BEGIN{ FS=""; ch=30} { for(i=1;i<=NF;i++){ c+=1 if (c==ch){ print in each line eg after every 5th character
gawk 'BEGIN{ FS=""; ch=5} { print substr($0,1,ch) "\n" substr.
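The gawk snippets above are truncated, so as an alternative sketch of the same task, here is a streaming Python version that writes a newline after every fixed number of characters without loading the whole file. It assumes the input is treated as one continuous stream (existing newlines count as ordinary characters), and `insert_newlines` is a hypothetical helper name.

```python
import io

def insert_newlines(src, dst, width):
    """Copy text from src to dst, writing a newline after every `width` characters."""
    if width <= 0:
        raise ValueError("width must be positive")
    while True:
        chunk = src.read(width)
        if not chunk:
            break
        dst.write(chunk)
        # Only complete chunks get a newline; a short final chunk is left as-is.
        if len(chunk) == width:
            dst.write("\n")

# In-memory demonstration; real use would pass files opened in text mode.
source = io.StringIO("abcdefgh")
target = io.StringIO()
insert_newlines(source, target, 3)
print(repr(target.getvalue()))  # 'abc\ndef\ngh'
```

Because only `width` characters are held at a time, memory use stays constant regardless of file size.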
|