You are surely already familiar with JSON: it is one of the most common formats for sharing data as text.
Did you know that there are different flavors of JSON? One of them is JSONL: it represents a JSON document in which the items are on separate lines instead of being in an array of items.
It’s a pretty rare format to find, so it can be tricky to understand how it works and how to parse it. In this article, we will learn how to parse a JSONL file with C#.
As explained in the JSON Lines documentation, a JSONL file is a file composed of different items separated by a \n character.
So, instead of having

    [{ "name": "Davide" }, { "name": "Emma" }]

you have a list of items without an array grouping them:

    { "name": "Davide" }
    { "name": "Emma" }
I must admit that I had never heard of this format until a few months ago. Or, even better, I had already used JSONL files without knowing it: JSONL is a common format for logs, where every entry is appended to the file in a continuous stream.
Also, JSONL has some characteristics:
- Each item is a valid JSON item
- Each line is separated by a \n character (or by \r\n, since the \r is ignored)
- It is encoded using UTF-8
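To make these characteristics concrete, here is a minimal sketch of how such a file could be produced. JsonlWriter and its two methods are hypothetical names invented for this example; the only Newtonsoft call used is JsonConvert.SerializeObject.

```csharp
using System.Collections.Generic;
using System.IO;
using System.Text;
using Newtonsoft.Json;

// Hypothetical helper, not part of any library: it only illustrates the format.
public static class JsonlWriter
{
    // Serializes each item as a valid JSON value on its own line.
    public static string ToJsonl<T>(IEnumerable<T> items)
    {
        var builder = new StringBuilder();
        foreach (T item in items)
        {
            // Formatting.None keeps the whole item on a single line.
            builder.Append(JsonConvert.SerializeObject(item, Formatting.None));
            // Lines are separated by the \n character.
            builder.Append('\n');
        }
        return builder.ToString();
    }

    // Writes the JSONL text to disk as UTF-8, without a byte order mark.
    public static void WriteJsonl<T>(string path, IEnumerable<T> items)
    {
        var utf8 = new UTF8Encoding(encoderShouldEmitUTF8Identifier: false);
        File.WriteAllText(path, ToJsonl(items), utf8);
    }
}
```

For instance, JsonlWriter.ToJsonl(new[] { new { name = "Davide" }, new { name = "Emma" } }) returns two lines, one per item.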
So, now, it’s time to parse it!
Say you’re creating a video game, and you want to read all the items your character has found:

    class Item
    {
        public int Id { get; set; }
        public string Name { get; set; }
        public string Category { get; set; }
    }

The list of items can be stored in a JSONL file as follows:

    { "id": 1, "name": "dynamite", "category": "weapon" }
    { "id": 2, "name": "ham", "category": "food" }
    { "id": 3, "name": "nail", "category": "tool" }

Now, all we have to do is read the file and parse it.
Assuming that we have read the content from a file and stored it in a string called content, we can use Newtonsoft to parse those lines.
As usual, let’s first see how to parse the file, and then dive deep into what’s going on. (Note: the following snippet comes from this question on Stack Overflow.)

    // Requires: using System.Collections.Generic; using System.IO; using Newtonsoft.Json;
    List<Item> items = new List<Item>();

    // SupportMultipleContent lets the reader move on to the next item
    var jsonReader = new JsonTextReader(new StringReader(content))
    {
        SupportMultipleContent = true
    };

    var jsonSerializer = new JsonSerializer();
    while (jsonReader.Read())
    {
        Item item = jsonSerializer.Deserialize<Item>(jsonReader);
        items.Add(item);
    }

    return items;

Let’s unpack it:

    var jsonReader = new JsonTextReader(new StringReader(content))
    {
        SupportMultipleContent = true
    };

The first thing to do is create an instance of JsonTextReader, a class that belongs to the Newtonsoft.Json namespace. Its constructor accepts a TextReader instance or any derived class, so we can use a StringReader instance, which represents a stream over a specified string.
The key part of this snippet (and, somehow, of the whole article) is the SupportMultipleContent property: when set to true, it allows the JsonTextReader to keep reading after the first item, treating the content as a sequence of JSON values.
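To see what the property changes, here is a small sketch that counts the top-level items in a string with the flag on or off. MultipleContentDemo and CountItems are hypothetical names created for this example; Skip is used only to jump to the end of the current item.

```csharp
using System.IO;
using Newtonsoft.Json;

// Hypothetical helper, written only to observe the effect of the flag.
public static class MultipleContentDemo
{
    // Counts the top-level JSON values found in the given text.
    public static int CountItems(string content, bool supportMultipleContent)
    {
        using var jsonReader = new JsonTextReader(new StringReader(content))
        {
            SupportMultipleContent = supportMultipleContent
        };

        int count = 0;
        while (jsonReader.Read())
        {
            // We are at the start of an item: skip its children
            // so that the next Read() lands on the following item.
            jsonReader.Skip();
            count++;
        }
        return count;
    }
}
```

With the flag set to true, a string holding two items on two lines is counted as two items. With the flag set to false, as far as I understand, the reader is expected to throw a JsonReaderException as soon as it meets the second item, because it considers the content finished after the first one.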
Its definition, in fact, is:

    public bool SupportMultipleContent { get; set; }

Finally, we can read the content:

    var jsonSerializer = new JsonSerializer();
    while (jsonReader.Read())
    {
        Item item = jsonSerializer.Deserialize<Item>(jsonReader);
        items.Add(item);
    }

Here we create a new JsonSerializer (again, coming from Newtonsoft) and use it to read one item at a time.
The while (jsonReader.Read()) loop allows us to read the stream until the end. To parse each item found in the stream, we use jsonSerializer.Deserialize<Item>(jsonReader).
The Deserialize method is smart enough to parse every item even without a , symbol separating them, because we have set SupportMultipleContent to true.
Once we have the Item object, we can do whatever we want with it, like adding it to a list.
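Since the JsonTextReader constructor accepts any TextReader, the same loop can also read straight from a file instead of from a string held in memory. The following is a sketch under that assumption, not the snippet from Stack Overflow: JsonlReader and both ReadItems overloads are hypothetical names, and the file path is whatever you pass in.

```csharp
using System.Collections.Generic;
using System.IO;
using Newtonsoft.Json;

public class Item
{
    public int Id { get; set; }
    public string Name { get; set; }
    public string Category { get; set; }
}

// Hypothetical wrapper around the loop explained above.
public static class JsonlReader
{
    // Lazily yields one Item at a time from any TextReader.
    public static IEnumerable<Item> ReadItems(TextReader source)
    {
        using var jsonReader = new JsonTextReader(source)
        {
            SupportMultipleContent = true
        };
        var jsonSerializer = new JsonSerializer();

        while (jsonReader.Read())
        {
            yield return jsonSerializer.Deserialize<Item>(jsonReader);
        }
    }

    // Convenience overload: streams the items from a file on disk.
    public static IEnumerable<Item> ReadItems(string path)
    {
        // File.OpenText opens the file as UTF-8 text.
        using var fileReader = File.OpenText(path);
        foreach (Item item in ReadItems(fileReader))
        {
            yield return item;
        }
    }
}
```

Because the methods use yield return, nothing is read until the caller enumerates the result, for example with foreach or by passing it to new List<Item>(...).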
As we have learned, there are different flavors of JSON. You can read an overview of them on Wikipedia.
Of course, the best place to learn more about this format is its official documentation.
This article exists thanks to the question that Amran Kadir asked on Stack Overflow and, of course, to Yuval Yitzhakov’s answer.
Since we used Newtonsoft (also known as Json.NET), you may want to take a look at its website.
Finally, the repository used for this article.
Maybe you’re thinking:
Why has Davide written an article about a Stack Overflow answer? I could have just read the same information there!
Well, if you were only interested in the main snippet, you would be right!
But this article exists for two main reasons.
First, I wanted to highlight that JSON is not always the best choice for everything: it always depends on what we need. For continuous streams of items, JSONL is a good (if not the best) choice. Don’t just pick the most used format: pick the one that best fits your needs!
Second, I wanted to remark that we should not be too attached to a specific library: I generally prefer using native stuff, so for reading JSON files, my first choice is System.Text.Json. But it is not always the best choice. Yes, we could write a complex workaround (like the second answer on Stack Overflow), but… is it worth it? Sometimes it is better to use another library, even if only for one specific task. So, you could use System.Text.Json for the whole project except for the part where you need to read a JSONL file.
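For comparison, here is a rough sketch of what a line-by-line approach with System.Text.Json might look like. It is not the Stack Overflow answer mentioned above: JsonlWithSystemTextJson and ReadItems are hypothetical names. It also assumes that every item sits on exactly one line, an assumption the Newtonsoft reader does not need.

```csharp
using System.Collections.Generic;
using System.IO;
using System.Text.Json;

public class Item
{
    public int Id { get; set; }
    public string Name { get; set; }
    public string Category { get; set; }
}

// Hypothetical helper: assumes exactly one JSON item per line.
public static class JsonlWithSystemTextJson
{
    private static readonly JsonSerializerOptions Options = new JsonSerializerOptions
    {
        // Lets "id" in the file match the Id property of the class.
        PropertyNameCaseInsensitive = true
    };

    public static IEnumerable<Item> ReadItems(TextReader source)
    {
        // ReadLine handles both \n and \r\n line endings.
        for (var line = source.ReadLine(); line != null; line = source.ReadLine())
        {
            // Tolerate blank lines, such as a trailing empty one.
            if (string.IsNullOrWhiteSpace(line))
            {
                continue;
            }

            yield return JsonSerializer.Deserialize<Item>(line, Options);
        }
    }
}
```

The trade-off is the one described above: this keeps a single JSON library in the project, but an item that spans more than one line would break it.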
Have you ever encountered an unusual format? How did you deal with it?