Simply put, Java 18 adds a new command line tool named jwebserver.
Usage example: jwebserver -p 9000
Running the command fires up a simple web server that serves static files
from the current directory.
You can define a custom port via -p and a custom directory via -d.
Right now the server is intended only for education, experiments, testing,
and similar uses; it is not intended for production.
For our example we place all the files in the same folder.
To test the Simple Web Server, we provide an optional HTML file and a JSON
file.
We can start the web server by running the jwebserver command from the
command line inside our folder.
By default, the server serves the static files on port 8000, for example
http://localhost:8000/test.json.
To change the port, use the parameter -p [port number]. You can find the
list of available options in the official documentation of jwebserver.
Instantiate the Simple Web Server from a Java class
If you want to build your own custom simple web server, or start a server
from within an application, you can use the SimpleFileServer class.
In our example, we start the server from code after defining the port and
the root folder of the static files.
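A minimal sketch of what this could look like is shown here. The class name StaticServer, the port 9000, and the use of the current directory as the root folder are assumptions made for illustration. SimpleFileServer.createFileServer() expects an absolute path, so the sketch converts the folder first.

```java
import com.sun.net.httpserver.HttpServer;
import com.sun.net.httpserver.SimpleFileServer;
import com.sun.net.httpserver.SimpleFileServer.OutputLevel;
import java.net.InetSocketAddress;
import java.nio.file.Path;

public class StaticServer {

    // Builds (but does not start) a file server for the given port and root folder.
    static HttpServer create(int port, Path root) {
        // createFileServer() requires an absolute path to an existing directory.
        Path absoluteRoot = root.toAbsolutePath().normalize();
        return SimpleFileServer.createFileServer(
                new InetSocketAddress(port), absoluteRoot, OutputLevel.INFO);
    }

    public static void main(String[] args) {
        // Port 9000 and the current directory are assumptions for this sketch.
        HttpServer server = create(9000, Path.of("."));
        server.start();
        System.out.println("Serving on port " + server.getAddress().getPort());
    }
}
```

Passing port 0 instead lets the operating system pick a free port, which can be handy in tests.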
The goal is to provide an educational server that can be started from the
command line, without external dependencies.
The server has no great ambitions, but it can be useful for training and,
more importantly for us, for quickly generating some REST responses during
the development of our frontend application.
It's very common to create a small Node.js server that returns a simple
JSON response so that we can quickly see results in our UI during frontend
development. I wrote a post about the Node.js implementation; you can find
the link at the bottom of the page.
The new Java Simple Web Server allows us to simulate a web service with
just a JSON file and the jwebserver command, without the need to create a
simple web server from scratch.
The implementation is very limited: it handles only GET and HEAD requests.
It's possible to extend the features for our test purposes using the
SimpleFileServer class.
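One possible way to extend the server for test purposes is sketched here. It combines the file handler from SimpleFileServer.createFileHandler() with a canned response for POST requests, using the HttpHandlers helper class that was added alongside it. The class name MockApiServer, the port, the 201 status, and the JSON body are assumptions chosen for illustration.

```java
import com.sun.net.httpserver.Headers;
import com.sun.net.httpserver.HttpHandler;
import com.sun.net.httpserver.HttpHandlers;
import com.sun.net.httpserver.HttpServer;
import com.sun.net.httpserver.SimpleFileServer;
import java.io.IOException;
import java.net.InetSocketAddress;
import java.nio.file.Path;

public class MockApiServer {

    // Serves static files for GET/HEAD and a canned JSON answer for every POST.
    static HttpServer create(int port, Path root) throws IOException {
        HttpHandler files =
                SimpleFileServer.createFileHandler(root.toAbsolutePath().normalize());

        // Hypothetical canned response; adjust status and body to the API being mocked.
        HttpHandler cannedPost = HttpHandlers.of(
                201,
                Headers.of("Content-Type", "application/json"),
                "{\"status\":\"created\"}");

        // Route POST requests to the canned handler, everything else to the file handler.
        HttpHandler combined = HttpHandlers.handleOrElse(
                request -> request.getRequestMethod().equals("POST"),
                cannedPost,
                files);

        return HttpServer.create(new InetSocketAddress(port), 0, "/", combined);
    }

    public static void main(String[] args) throws IOException {
        HttpServer server = create(9000, Path.of("."));
        server.start();
        System.out.println("Mock API on port " + server.getAddress().getPort());
    }
}
```

The file handler keeps serving the JSON files from the folder, so the frontend can read test data with GET and still receive a plausible answer when it submits a form.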
UTF-8 by Default
The conversion between raw bytes and the Java programming language’s
16-bit char values is governed by a charset. US-ASCII, UTF-8, and
ISO-8859-1 are examples of supported charsets.
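To make the role of the charset concrete, the following small sketch encodes the same one-character string with three charsets and compares the number of bytes. The class and method names are made up for this illustration.

```java
import java.nio.charset.Charset;
import java.nio.charset.StandardCharsets;

public class CharsetBytes {

    // Returns how many bytes the text occupies when encoded with the given charset.
    static int encodedLength(String text, Charset charset) {
        return text.getBytes(charset).length;
    }

    public static void main(String[] args) {
        String text = "\u00e9"; // the single character 'é'

        // UTF-8 needs two bytes for this character.
        System.out.println("UTF-8      : " + encodedLength(text, StandardCharsets.UTF_8));
        // ISO-8859-1 stores it in one byte.
        System.out.println("ISO-8859-1 : " + encodedLength(text, StandardCharsets.ISO_8859_1));
        // US-ASCII cannot represent it and substitutes a single '?' byte.
        System.out.println("US-ASCII   : " + encodedLength(text, StandardCharsets.US_ASCII));
    }
}
```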
In JDK 17 and earlier, standard Java APIs normally used the default
charset if no charset was given. At startup, the JDK determined the
default charset based on the run-time environment, which includes the
operating system, the user's locale, and other considerations. For
example, on Windows it is a codepage-based charset such as windows-1252,
and on macOS it is UTF-8.
That has changed: when no charset is specified explicitly, the default
charset that the JDK now picks for you is always UTF-8.
We can view the default charset by running the following command and
checking the file.encoding property:
java -XshowSettings:properties
The Problem
The Java standard character set determines how Strings are converted to
bytes and vice versa in numerous methods of the JDK class library (e.g.,
when writing and reading a text file). These include, for example:
- the constructors of FileReader, FileWriter, InputStreamReader, and
  OutputStreamWriter,
- the constructors of Formatter and Scanner,
- the static methods URLEncoder.encode() and URLDecoder.decode().
This can lead to unpredictable behavior when an application is developed
and tested in one environment – and then run in another (where Java chooses
a different default character set).
For example, let's run the following code on Linux or macOS (the Japanese
text is "Happy Coding!" according to Google Translate):
try (FileWriter fw = new FileWriter("happy-coding.txt");
     BufferedWriter bw = new BufferedWriter(fw)) {
    // FileWriter without a charset argument uses the default character set
    bw.write("ハッピーコーディング！");
}
And then, we load this file with the following code on Windows:
try (FileReader fr = new FileReader("happy-coding.txt");
     BufferedReader br = new BufferedReader(fr)) {
    String line = br.readLine();
    System.out.println(line);
}
Then garbled text like the following is displayed:
ãƒ�ãƒƒãƒ”ãƒ¼ã‚³ãƒ¼ãƒ‡ã‚£ãƒ³ã‚°ï¼�
That is because Linux and macOS store the file in UTF-8 format, and Windows
tries to read it in Windows-1252 format.
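The effect can be reproduced on any operating system without touching a file. The sketch below encodes a string as UTF-8 and decodes the bytes as windows-1252, which is what happens when the file travels from Linux or macOS to a Windows machine running Java 17 or earlier. The class and method names are illustrative.

```java
import java.nio.charset.Charset;
import java.nio.charset.StandardCharsets;

public class MojibakeDemo {

    // Encodes as UTF-8 (the writer's side) and decodes as windows-1252 (the reader's side).
    static String misread(String text) {
        byte[] utf8Bytes = text.getBytes(StandardCharsets.UTF_8);
        return new String(utf8Bytes, Charset.forName("windows-1252"));
    }

    public static void main(String[] args) {
        String original = "ハッピーコーディング！";
        String garbled = misread(original);

        // Each Japanese character takes three bytes in UTF-8, and windows-1252
        // turns every byte into its own character, so the text triples in length.
        System.out.println(original.length() + " chars written, "
                + garbled.length() + " chars read");

        // Pure ASCII text survives because both charsets agree on those bytes.
        System.out.println(misread("Java18"));
    }
}
```

This also explains why such bugs often stay hidden: as long as the text is plain ASCII, both charsets produce identical bytes.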
The Problem – Stage Two
It becomes even more chaotic because newer class library methods do not
respect the default character set but always use UTF-8 if no character set
is specified. These methods include, for example, Files.writeString(),
Files.readString(), Files.newBufferedWriter(), and
Files.newBufferedReader().
Let's start the following program, which writes the Japanese text via
FileWriter and reads it directly afterward via Files.readString():
try (FileWriter fw = new FileWriter("happy-coding.txt");
     BufferedWriter bw = new BufferedWriter(fw)) {
    bw.write("ハッピーコーディング！");
}
String text = Files.readString(Path.of("happy-coding.txt"));
System.out.println(text);
Linux and macOS display the correct Japanese text. On Windows, however, we
see only question marks:
???????????
That is because, on Windows, FileWriter writes the file using the standard
Java character set Windows-1252, but Files.readString() reads the file
back in as UTF-8, regardless of the standard character set.
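The question marks come from the write step, not the read step: a charset that cannot represent a character substitutes a replacement byte, and for Windows-1252 that is "?". A small in-memory sketch of this, with illustrative class and method names:

```java
import java.nio.charset.Charset;
import java.nio.charset.StandardCharsets;

public class LossyWriteDemo {

    // Simulates "write with one charset, read back as UTF-8" without using a file.
    static String writeThenReadAsUtf8(String text, Charset writeCharset) {
        // Characters the charset cannot encode are replaced with '?' bytes here.
        byte[] stored = text.getBytes(writeCharset);
        return new String(stored, StandardCharsets.UTF_8);
    }

    public static void main(String[] args) {
        String text = "ハッピーコーディング！";

        // Windows-1252 has no Japanese characters, so the information is lost on write.
        System.out.println(writeThenReadAsUtf8(text, Charset.forName("windows-1252")));

        // Writing and reading with the same charset round-trips cleanly.
        System.out.println(writeThenReadAsUtf8(text, StandardCharsets.UTF_8));
    }
}
```

Because the original characters never reach the file, no choice of charset on the reading side can recover them.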
Possible Solutions to Date
To protect an application against such errors, there have been two
options so far:
- Specify the character set when calling all methods that convert strings
  to bytes and vice versa.
- Set the default character set via the system property "file.encoding".
The first option leads to a lot of code duplication and is thus messy and
error-prone:
FileWriter fw = new FileWriter("happy-coding.txt", StandardCharsets.UTF_8);
FileReader fr = new FileReader("happy-coding.txt", StandardCharsets.UTF_8);
Files.readString(Path.of("happy-coding.txt"), StandardCharsets.UTF_8);
Specifying the character set parameters also prevents us from using method
references, as in the following example:
Stream<String> encodedParams = ...
Stream<String> decodedParams = encodedParams.map(URLDecoder::decode);
Instead, we would have to write:
Stream<String> encodedParams = ...
Stream<String> decodedParams =
    encodedParams.map(s -> URLDecoder.decode(s, StandardCharsets.UTF_8));
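If method references matter, one workaround is to hide the charset in a small helper method and reference that instead. This is only a sketch; the Utf8Urls class and its decode() method are hypothetical names, not part of the JDK.

```java
import java.net.URLDecoder;
import java.nio.charset.StandardCharsets;
import java.util.List;
import java.util.stream.Stream;

public class Utf8Urls {

    // Fixes the charset in one place so callers can use a method reference.
    static String decode(String encoded) {
        return URLDecoder.decode(encoded, StandardCharsets.UTF_8);
    }

    public static void main(String[] args) {
        List<String> decoded = Stream.of("a%20b", "caf%C3%A9")
                .map(Utf8Urls::decode)
                .toList();
        System.out.println(decoded);
    }
}
```

The duplication does not disappear, but it is confined to one helper per API instead of being spread across every call site.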
The second option (the system property "file.encoding") has two drawbacks.
First, it was not officially documented up to and including Java 17 (see
system properties documentation). Second, as explained above, the
specified character set is not used by all API methods. So this variant is
also error-prone, as we can show with the example from above:
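import java.io.BufferedWriter;
import java.io.FileWriter;
import java.io.IOException;
import java.nio.file.Files;
import java.nio.file.Path;

public class Jep400Example {
    public static void main(String[] args) throws IOException {
        // FileWriter honors the default character set (file.encoding)
        try (FileWriter fw = new FileWriter("happy-coding.txt");
             BufferedWriter bw = new BufferedWriter(fw)) {
            bw.write("ハッピーコーディング！");
        }
        // Files.readString() always decodes the file as UTF-8
        String text = Files.readString(Path.of("happy-coding.txt"));
        System.out.println(text);
    }
}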
Let's run the program once with standard encoding US-ASCII:
$ java -Dfile.encoding=US-ASCII Jep400Example.java
?????????????????????????????????
The result is garbage because FileWriter takes the default encoding into
account, but Files.readString() ignores it and always uses UTF-8. So this
variant only works reliably if you use UTF-8 uniformly:
$ java -Dfile.encoding=UTF-8 Jep400Example.java
ハッピーコーディング！
JEP 400 to the Rescue
With JDK Enhancement Proposal 400, the problems mentioned above will –
at least for the most part – be a thing of the past as of Java 18.
The default encoding will always be UTF-8 regardless of the operating
system, locale, and language settings.
Also, the system property "file.encoding" will be documented, and we can
use it legitimately. However, we should do this with caution. The fact that
the Files methods ignore the configured default encoding will not be
changed by JEP 400.
According to the documentation, only the values "UTF-8" and "COMPAT"
should be used anyway, with UTF-8 providing consistent encoding and COMPAT
simulating pre-Java 18 behavior. All other values lead to unspecified
behavior.
Quite possibly, "file.encoding" will be deprecated in the future and later
removed to eliminate the remaining potential source of errors (methods that
respect the default encoding vs. those that do not).
The best way is always to set "-Dfile.encoding" to UTF-8 or omit it
altogether.
Reading the Encodings at Runtime
The current default encoding can be read at runtime via Charset.defaultCharset()
or the system property "file.encoding". Since Java 17, the
system property "native.encoding" can be used to read the encoding, which –
before Java 18 – would be the default encoding if none is specified:
System.out.println("Default charset : " + Charset.defaultCharset());
System.out.println("file.encoding : " + System.getProperty("file.encoding"));
System.out.println("native.encoding : " + System.getProperty("native.encoding"));
Without specifying -Dfile.encoding, the program prints the following on
Linux and macOS with Java 17 and Java 18:
Default charset : UTF-8
file.encoding : UTF-8
native.encoding : UTF-8
On Windows and Java 17, the output is as follows:
Default charset : windows-1252
file.encoding : Cp1252
native.encoding : Cp1252
And on Windows and Java 18:
Default charset : UTF-8
file.encoding : UTF-8
native.encoding : Cp1252
So the native encoding on Windows remains the same, but the default
encoding changes to UTF-8 according to this JEP.
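The "native.encoding" property is useful when an application running on Java 18 still has to read files that were written in the platform's old default encoding. The following is a sketch under that assumption; the class name NativeEncoding and its methods are hypothetical helpers.

```java
import java.io.IOException;
import java.nio.charset.Charset;
import java.nio.file.Files;
import java.nio.file.Path;

public class NativeEncoding {

    // Resolves the platform's native charset, which was the default before Java 18.
    static Charset nativeCharset() {
        String name = System.getProperty("native.encoding");
        // The property exists since Java 17; fall back to the default charset if absent.
        return name == null ? Charset.defaultCharset() : Charset.forName(name);
    }

    // Reads a file that is assumed to have been written in the native encoding.
    static String readLegacyFile(Path file) throws IOException {
        return Files.readString(file, nativeCharset());
    }

    public static void main(String[] args) {
        System.out.println("Native charset: " + nativeCharset());
    }
}
```

Passing the charset explicitly keeps the read independent of whatever "file.encoding" is set to.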
The Previous "Default" Character Set
If we run the little program from above on Linux or macOS and Java 17 with
the -Dfile.encoding=default parameter, we get the following output:
Default charset : US-ASCII
file.encoding : default
native.encoding : UTF-8
This is because the name "default" was previously recognized as an alias
for the encoding "US-ASCII".
In Java 18, this is changed: "default" is no longer recognized; the output
looks like this:
Default charset : UTF-8
file.encoding : default
native.encoding : UTF-8
The system property "file.encoding" still reads "default", but any other
invalid value would be echoed back in the same way. For an invalid
"file.encoding" value, the default character set is always UTF-8 as of
Java 18; up to Java 17, it corresponds to the native encoding.
Charset.forName() Taking Fallback Default Value
Not part of the above JEP and not defined in any other JEP is the new
method Charset.forName(String charsetName, Charset fallback). This method
returns the specified fallback value instead of throwing an
IllegalCharsetNameException or an UnsupportedCharsetException if the
character set name is unknown or the character set is not supported.
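A short sketch of how the new method might be used, for instance when a charset name comes from configuration or user input. The class name CharsetFallback, the resolve() helper, and the choice of UTF-8 as the fallback are assumptions for this example. It requires Java 18 or later.

```java
import java.nio.charset.Charset;
import java.nio.charset.StandardCharsets;

public class CharsetFallback {

    // Resolves a charset name, falling back to UTF-8 instead of throwing.
    static Charset resolve(String name) {
        return Charset.forName(name, StandardCharsets.UTF_8);
    }

    public static void main(String[] args) {
        // A known charset is returned as usual.
        System.out.println(resolve("ISO-8859-1"));
        // A syntactically legal but unknown name yields the fallback.
        System.out.println(resolve("no-such-charset"));
        // A name that is not legal at all (it contains spaces) also yields the fallback.
        System.out.println(resolve("not a legal name"));
    }
}
```

Before Java 18, the same behavior needed a try/catch around the one-argument Charset.forName().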