Java Concise Tutorial
Java - Character Class
Normally, when we work with characters, we use the primitive data type char.
Example
char ch = 'a';
// Unicode for uppercase Greek omega character
char uniChar = '\u03A9';
// an array of chars
char[] charArray = { 'a', 'b', 'c', 'd', 'e' };
Use of Character Class in Java
However, in development we come across situations where we need to use objects instead of primitive data types. For this purpose, Java provides the wrapper class Character for the primitive data type char.
Java Character Class
The Character class offers a number of useful class (i.e., static) methods for manipulating characters. You can create a Character object with the Character constructor, although this constructor has been deprecated since Java 9:
Character ch = new Character('a');
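As an alternative to calling the constructor, the static factory method Character.valueOf(char) listed in the method table also returns a Character object. The following is a minimal sketch of that approach; the class name ValueOfDemo and the helper wrap are made up for illustration and are not part of the Java API.

```java
class ValueOfDemo {
    // Hypothetical helper: wraps a primitive char without calling the constructor.
    static Character wrap(char c) {
        return Character.valueOf(c);
    }

    public static void main(String[] args) {
        Character ch = wrap('a');
        System.out.println(ch);                                 // prints a
        System.out.println(ch.charValue());                     // prints a
        System.out.println(ch.equals(Character.valueOf('a')));  // prints true
    }
}
```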
The Java compiler will also create a Character object for you under some circumstances. For example, if you pass a primitive char into a method that expects an object, the compiler automatically converts the char to a Character for you. This feature is called autoboxing; the conversion in the opposite direction, from Character to char, is called unboxing.
Example of Java Character Class
// Here the primitive char 'a'
// is boxed into the Character object ch
Character ch = 'a';
// Here the primitive 'x' is boxed for the method test,
// and the return value is unboxed into the char c
char c = test('x');
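The snippet above does not define the method test. The sketch below fills that gap with an assumed implementation so that the boxing and unboxing steps can be run; the class name BoxingDemo and the body of test (upper-casing the argument) are illustrative choices, not something the tutorial specifies.

```java
class BoxingDemo {
    // Hypothetical stand-in for test: it takes a Character object
    // and returns a Character object.
    static Character test(Character c) {
        char upper = Character.toUpperCase(c); // c is unboxed for the call
        return upper;                          // upper is boxed on return
    }

    public static void main(String[] args) {
        Character ch = 'a';   // primitive 'a' is boxed into ch
        char c = test('x');   // 'x' is boxed for the call, the result is unboxed
        System.out.println(ch); // prints a
        System.out.println(c);  // prints X
    }
}
```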
Escape Sequences
A character preceded by a backslash (\) is an escape sequence and has a special meaning to the compiler.
The newline character (\n) has been used frequently in this tutorial in System.out.println() statements to advance to the next line after the string is printed.
The following table shows the Java escape sequences:
Escape Sequence | Description |
\t | Inserts a tab in the text at this point. |
\b | Inserts a backspace in the text at this point. |
\n | Inserts a newline in the text at this point. |
\r | Inserts a carriage return in the text at this point. |
\f | Inserts a form feed in the text at this point. |
\' | Inserts a single quote character in the text at this point. |
\" | Inserts a double quote character in the text at this point. |
When an escape sequence is encountered in a print statement, the compiler interprets it accordingly.
Example: Escape Sequences
If you want to put quotes within quotes, you must use the escape sequence \" on the interior quotes:
public class Test {
    public static void main(String[] args) {
        System.out.println("She said \"Hello!\" to me.");
    }
}
She said "Hello!" to me.
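The other escape sequences from the table can be combined in the same way. The following sketch builds one string that uses \t, \n, \' and \"; the class name EscapeDemo and the helper sample are hypothetical names chosen for this illustration.

```java
class EscapeDemo {
    // Hypothetical helper: returns a two-line string built with escape sequences.
    static String sample() {
        // \t inserts a tab, \n starts a new line,
        // \' and \" insert literal quote characters
        return "Name:\tJava\nQuote:\t\'single\' and \"double\"";
    }

    public static void main(String[] args) {
        System.out.println(sample());
    }
}
```

When run, the text after each label is aligned by a tab and the second label starts on a new line.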
Character Class
Declaration
Following is the declaration for the java.lang.Character class:
public final class Character
extends Object
implements Serializable, Comparable<Character>
Fields
Following are the fields of the java.lang.Character class:
- static byte COMBINING_SPACING_MARK − This is the General category "Mc" in the Unicode specification.
- static byte CONNECTOR_PUNCTUATION − This is the General category "Pc" in the Unicode specification.
- static byte CONTROL − This is the General category "Cc" in the Unicode specification.
- static byte CURRENCY_SYMBOL − This is the General category "Sc" in the Unicode specification.
- static byte DASH_PUNCTUATION − This is the General category "Pd" in the Unicode specification.
- static byte DECIMAL_DIGIT_NUMBER − This is the General category "Nd" in the Unicode specification.
- static byte DIRECTIONALITY_ARABIC_NUMBER − This is the Weak bidirectional character type "AN" in the Unicode specification.
- static byte DIRECTIONALITY_BOUNDARY_NEUTRAL − This is the Weak bidirectional character type "BN" in the Unicode specification.
- static byte DIRECTIONALITY_COMMON_NUMBER_SEPARATOR − This is the Weak bidirectional character type "CS" in the Unicode specification.
- static byte DIRECTIONALITY_EUROPEAN_NUMBER − This is the Weak bidirectional character type "EN" in the Unicode specification.
- static byte DIRECTIONALITY_EUROPEAN_NUMBER_SEPARATOR − This is the Weak bidirectional character type "ES" in the Unicode specification.
- static byte DIRECTIONALITY_EUROPEAN_NUMBER_TERMINATOR − This is the Weak bidirectional character type "ET" in the Unicode specification.
- static byte DIRECTIONALITY_LEFT_TO_RIGHT − This is the Strong bidirectional character type "L" in the Unicode specification.
- static byte DIRECTIONALITY_LEFT_TO_RIGHT_EMBEDDING − This is the Strong bidirectional character type "LRE" in the Unicode specification.
- static byte DIRECTIONALITY_LEFT_TO_RIGHT_OVERRIDE − This is the Strong bidirectional character type "LRO" in the Unicode specification.
- static byte DIRECTIONALITY_NONSPACING_MARK − This is the Weak bidirectional character type "NSM" in the Unicode specification.
- static byte DIRECTIONALITY_OTHER_NEUTRALS − This is the Neutral bidirectional character type "ON" in the Unicode specification.
- static byte DIRECTIONALITY_PARAGRAPH_SEPARATOR − This is the Neutral bidirectional character type "B" in the Unicode specification.
- static byte DIRECTIONALITY_POP_DIRECTIONAL_FORMAT − This is the Weak bidirectional character type "PDF" in the Unicode specification.
- static byte DIRECTIONALITY_RIGHT_TO_LEFT − This is the Strong bidirectional character type "R" in the Unicode specification.
- static byte DIRECTIONALITY_RIGHT_TO_LEFT_ARABIC − This is the Strong bidirectional character type "AL" in the Unicode specification.
- static byte DIRECTIONALITY_RIGHT_TO_LEFT_EMBEDDING − This is the Strong bidirectional character type "RLE" in the Unicode specification.
- static byte DIRECTIONALITY_RIGHT_TO_LEFT_OVERRIDE − This is the Strong bidirectional character type "RLO" in the Unicode specification.
- static byte DIRECTIONALITY_SEGMENT_SEPARATOR − This is the Neutral bidirectional character type "S" in the Unicode specification.
- static byte DIRECTIONALITY_UNDEFINED − This is the Undefined bidirectional character type.
- static byte DIRECTIONALITY_WHITESPACE − This is the Neutral bidirectional character type "WS" in the Unicode specification.
- static byte ENCLOSING_MARK − This is the General category "Me" in the Unicode specification.
- static byte END_PUNCTUATION − This is the General category "Pe" in the Unicode specification.
- static byte FINAL_QUOTE_PUNCTUATION − This is the General category "Pf" in the Unicode specification.
- static byte FORMAT − This is the General category "Cf" in the Unicode specification.
- static byte INITIAL_QUOTE_PUNCTUATION − This is the General category "Pi" in the Unicode specification.
- static byte LETTER_NUMBER − This is the General category "Nl" in the Unicode specification.
- static byte LINE_SEPARATOR − This is the General category "Zl" in the Unicode specification.
- static byte LOWERCASE_LETTER − This is the General category "Ll" in the Unicode specification.
- static byte MATH_SYMBOL − This is the General category "Sm" in the Unicode specification.
- static int MAX_CODE_POINT − This is the maximum value of a Unicode code point.
- static char MAX_HIGH_SURROGATE − This is the maximum value of a Unicode high-surrogate code unit in the UTF-16 encoding.
- static char MAX_LOW_SURROGATE − This is the maximum value of a Unicode low-surrogate code unit in the UTF-16 encoding.
- static int MAX_RADIX − This is the maximum radix available for conversion to and from strings.
- static char MAX_SURROGATE − This is the maximum value of a Unicode surrogate code unit in the UTF-16 encoding.
- static char MAX_VALUE − The constant value of this field is the largest value of type char, '\uFFFF'.
- static int MIN_CODE_POINT − This is the minimum value of a Unicode code point.
- static char MIN_HIGH_SURROGATE − This is the minimum value of a Unicode high-surrogate code unit in the UTF-16 encoding.
- static char MIN_LOW_SURROGATE − This is the minimum value of a Unicode low-surrogate code unit in the UTF-16 encoding.
- static int MIN_RADIX − This is the minimum radix available for conversion to and from strings.
- static int MIN_SUPPLEMENTARY_CODE_POINT − This is the minimum value of a supplementary code point.
- static char MIN_SURROGATE − This is the minimum value of a Unicode surrogate code unit in the UTF-16 encoding.
- static char MIN_VALUE − The constant value of this field is the smallest value of type char, '\u0000'.
- static byte MODIFIER_LETTER − This is the General category "Lm" in the Unicode specification.
- static byte MODIFIER_SYMBOL − This is the General category "Sk" in the Unicode specification.
- static byte NON_SPACING_MARK − This is the General category "Mn" in the Unicode specification.
- static byte OTHER_LETTER − This is the General category "Lo" in the Unicode specification.
- static byte OTHER_NUMBER − This is the General category "No" in the Unicode specification.
- static byte OTHER_PUNCTUATION − This is the General category "Po" in the Unicode specification.
- static byte OTHER_SYMBOL − This is the General category "So" in the Unicode specification.
- static byte PARAGRAPH_SEPARATOR − This is the General category "Zp" in the Unicode specification.
- static byte PRIVATE_USE − This is the General category "Co" in the Unicode specification.
- static int SIZE − This is the number of bits used to represent a char value in unsigned binary form.
- static byte SPACE_SEPARATOR − This is the General category "Zs" in the Unicode specification.
- static byte START_PUNCTUATION − This is the General category "Ps" in the Unicode specification.
- static byte SURROGATE − This is the General category "Cs" in the Unicode specification.
- static byte TITLECASE_LETTER − This is the General category "Lt" in the Unicode specification.
- static Class<Character> TYPE − This is the Class instance representing the primitive type char.
- static byte UNASSIGNED − This is the General category "Cn" in the Unicode specification.
- static byte UPPERCASE_LETTER − This is the General category "Lu" in the Unicode specification.
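To show how these constants are typically used, the sketch below prints a few of the numeric limits and compares the result of Character.getType against some of the general-category fields. The class name FieldsDemo and the helper categoryOf are assumptions made for this example, and the helper only recognizes four categories.

```java
class FieldsDemo {
    // Hypothetical helper: maps a few general-category constants to their
    // two-letter Unicode abbreviations; everything else becomes "other".
    static String categoryOf(char ch) {
        int type = Character.getType(ch);
        if (type == Character.UPPERCASE_LETTER) return "Lu";
        if (type == Character.LOWERCASE_LETTER) return "Ll";
        if (type == Character.DECIMAL_DIGIT_NUMBER) return "Nd";
        if (type == Character.SPACE_SEPARATOR) return "Zs";
        return "other";
    }

    public static void main(String[] args) {
        System.out.println(Character.MIN_RADIX);   // prints 2
        System.out.println(Character.MAX_RADIX);   // prints 36
        System.out.println(Character.SIZE);        // prints 16
        System.out.println(Integer.toHexString(Character.MAX_CODE_POINT)); // prints 10ffff
        System.out.println(categoryOf('A'));       // prints Lu
        System.out.println(categoryOf('7'));       // prints Nd
    }
}
```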
Class constructors
Sr.No. | Constructor & Description |
1 | Character(char value): This constructs a newly allocated Character object that represents the specified char value. |
Class methods
Sr.No. | Method & Description |
1 | static int charCount(int codePoint): This method determines the number of char values needed to represent the specified character (Unicode code point). |
2 | char charValue(): This method returns the value of this Character object. |
3 | static int codePointAt(char[] a, int index): This method returns the code point at the given index of the char array. |
4 | static int codePointBefore(char[] a, int index): This method returns the code point preceding the given index of the char array. |
5 | static int codePointCount(char[] a, int offset, int count): This method returns the number of Unicode code points in a subarray of the char array argument. |
6 | int compareTo(Character anotherCharacter): This method compares two Character objects numerically. |
7 | static int digit(char ch, int radix): This method returns the numeric value of the character ch in the specified radix. |
8 | boolean equals(Object obj): This method compares this object against the specified object. |
9 | static char forDigit(int digit, int radix): This method determines the character representation for a specific digit in the specified radix. |
10 | static byte getDirectionality(char ch): This method returns the Unicode directionality property for the given character. |
11 | static int getNumericValue(char ch): This method returns the int value that the specified Unicode character represents. |
12 | static int getType(char ch): This method returns a value indicating a character's general category. |
13 | int hashCode(): This method returns a hash code for this Character. |
14 | static boolean isDefined(char ch): This method determines if a character is defined in Unicode. |
15 | static boolean isDigit(char ch): This method determines if the specified character is a digit. |
16 | static boolean isHighSurrogate(char ch): This method determines if the given char value is a high-surrogate code unit (also known as leading-surrogate code unit). |
17 | static boolean isIdentifierIgnorable(char ch): This method determines if the specified character should be regarded as an ignorable character in a Java identifier or a Unicode identifier. |
18 | static boolean isISOControl(char ch): This method determines if the specified character is an ISO control character. |
19 | static boolean isJavaIdentifierPart(char ch): This method determines if the specified character may be part of a Java identifier as other than the first character. |
20 | static boolean isJavaIdentifierStart(char ch): This method determines if the specified character is permissible as the first character in a Java identifier. |
21 | static boolean isLetter(char ch): This method determines if the specified character is a letter. |
22 | static boolean isLetterOrDigit(char ch): This method determines if the specified character is a letter or digit. |
23 | static boolean isLowerCase(char ch): This method determines if the specified character is a lowercase character. |
24 | static boolean isLowSurrogate(char ch): This method determines if the given char value is a low-surrogate code unit (also known as trailing-surrogate code unit). |
25 | static boolean isMirrored(char ch): This method determines whether the character is mirrored according to the Unicode specification. |
26 | static boolean isSpaceChar(char ch): This method determines if the specified character is a Unicode space character. |
27 | static boolean isSupplementaryCodePoint(int codePoint): This method determines whether the specified character (Unicode code point) is in the supplementary character range. |
28 | static boolean isSurrogatePair(char high, char low): This method determines whether the specified pair of char values is a valid surrogate pair. |
29 | static boolean isTitleCase(char ch): This method determines if the specified character is a titlecase character. |
30 | static boolean isUnicodeIdentifierPart(char ch): This method determines if the specified character may be part of a Unicode identifier as other than the first character. |
31 | static boolean isUnicodeIdentifierStart(char ch): This method determines if the specified character is permissible as the first character in a Unicode identifier. |
32 | static boolean isUpperCase(char ch): This method determines if the specified character is an uppercase character. |
33 | static boolean isValidCodePoint(int codePoint): This method determines whether the specified code point is a valid Unicode code point value in the range of 0x0000 to 0x10FFFF inclusive. |
34 | static boolean isWhitespace(char ch): This method determines if the specified character is white space according to Java. |
35 | static int offsetByCodePoints(char[] a, int start, int count, int index, int codePointOffset): This method returns the index within the given char subarray that is offset from the given index by codePointOffset code points. |
36 | static char reverseBytes(char ch): This method returns the value obtained by reversing the order of the bytes in the specified char value. |
37 | static char[] toChars(int codePoint): This method converts the specified character (Unicode code point) to its UTF-16 representation stored in a char array. |
38 | static int toCodePoint(char high, char low): This method converts the specified surrogate pair to its supplementary code point value. |
39 | static char toLowerCase(char ch): This method converts the character argument to lowercase using case mapping information from the UnicodeData file. |
40 | String toString(): This method returns a String object representing this Character's value. |
41 | static char toTitleCase(char ch): This method converts the character argument to titlecase using case mapping information from the UnicodeData file. |
42 | static char toUpperCase(char ch): This method converts the character argument to uppercase using case mapping information from the UnicodeData file. |
43 | static Character valueOf(char c): This method returns a Character instance representing the specified char value. |
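A few of the classification and conversion methods from the table can be tried together. The following is a small sketch rather than a complete tour; the class name MethodsDemo and the helpers sumDigits and capitalize are hypothetical and exist only to exercise isDigit, digit, toUpperCase, forDigit, isWhitespace and isLetterOrDigit.

```java
class MethodsDemo {
    // Hypothetical helper: adds up the decimal digits found in text
    // and ignores every other character.
    static int sumDigits(String text) {
        int sum = 0;
        for (char ch : text.toCharArray()) {
            if (Character.isDigit(ch)) {
                sum += Character.digit(ch, 10);
            }
        }
        return sum;
    }

    // Hypothetical helper: upper-cases the first character of word.
    static String capitalize(String word) {
        if (word.isEmpty()) {
            return word;
        }
        return Character.toUpperCase(word.charAt(0)) + word.substring(1);
    }

    public static void main(String[] args) {
        System.out.println(sumDigits("a1b2c3"));              // prints 6
        System.out.println(capitalize("java"));               // prints Java
        System.out.println(Character.forDigit(11, 16));       // prints b
        System.out.println(Character.isWhitespace('\t'));     // prints true
        System.out.println(Character.isLetterOrDigit('_'));   // prints false
    }
}
```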
Methods inherited
This class inherits methods from the following class:
- java.lang.Object
Example
The following example shows the usage of the Java Character charCount() method. In this program, we create an int variable and assign it a hexadecimal code point value that lies above the range of a single char. Using the charCount() method, we then check whether the code point needs one char value or two, that is, whether it is a supplementary character, and print the result. Note that charCount() does not validate that its argument is a Unicode code point.
package com.tutorialspoint;

public class CharacterDemo {
    public static void main(String[] args) {
        // a code point above U+FFFF, i.e. in the supplementary range
        int cp = 0x12345;

        // number of char values (UTF-16 code units) needed to represent cp
        int res = Character.charCount(cp);

        String str1 = "It is not a supplementary character";
        String str2 = "It is a supplementary character";

        // charCount returns 1 for code points up to U+FFFF and 2 above that
        if (res == 1) {
            System.out.println(str1);
        } else if (res == 2) {
            System.out.println(str2);
        }
    }
}
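Because charCount() only compares its argument with the start of the supplementary range, a stricter check can combine it with isValidCodePoint() and isSupplementaryCodePoint() from the method table. The sketch below shows one way to do that and also round-trips the code point through toChars() and toCodePoint(); the class name CodePointDemo and the helper describe are illustrative assumptions.

```java
class CodePointDemo {
    // Hypothetical helper: reports how a code point is stored in UTF-16.
    static String describe(int cp) {
        if (!Character.isValidCodePoint(cp)) {
            return "invalid";
        }
        return Character.isSupplementaryCodePoint(cp) ? "surrogate pair" : "single char";
    }

    public static void main(String[] args) {
        int cp = 0x12345;
        char[] units = Character.toChars(cp);   // UTF-16 code units for cp

        System.out.println(describe(cp));       // prints surrogate pair
        System.out.println(units.length);       // prints 2
        System.out.println(Character.isSurrogatePair(units[0], units[1]));    // prints true
        System.out.println(Character.toCodePoint(units[0], units[1]) == cp);  // prints true

        // 0x110000 is outside the Unicode range, yet charCount still returns 2
        System.out.println(describe(0x110000));            // prints invalid
        System.out.println(Character.charCount(0x110000)); // prints 2
    }
}
```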