These are the release notes for version 0.100. We skipped a few versions since the last release (0.97), because 0.100 should denote a major change at the very heart of ClosedXML. It is not as clean a break as I hoped, but close enough.
The list of everything that changed from 0.97 to 0.100 is in the migration guide at https://closedxml.readthedocs.io/en/latest/migrations/migrate-to-0.100.html
This is more like a list of reasons why you should upgrade despite the breaking changes :)
Memory consumption during saving of large workbooks was decreased
Memory consumption during saving of large data workbooks was significantly improved. Originally, the ClosedXML workbook representation was converted to a DocumentFormat.OpenXml DOM representation and the DOM was then saved. Instead of creating the whole DOM, sheet data (= cell values) are now streamed directly to the output file and aren't included in the DOM.
To demonstrate the difference, see the before and after memory consumption of a report that generated 30 000 rows and 45 columns. Memory consumption decreased from 2.08 GiB to 0.8 GiB.
Save cells and strings through DOM: 2.08 GiB
Save cells and strings through streaming: 0.8 GiB
The purple area is the bytes of the uncompressed package zip stream.
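The difference between building a DOM and streaming can be sketched with the standard .NET XML APIs. This is only an illustration of the general technique, not ClosedXML's internal code: the class name SheetDataSketch, both method names and the simplified element layout are assumptions made up for the example.

```csharp
using System.IO;
using System.Xml;
using System.Xml.Linq;

// Hypothetical helper, not part of ClosedXML.
static class SheetDataSketch
{
    // DOM approach: every row and cell element stays in memory until Save is called.
    public static void SaveThroughDom(Stream output, int rows, int cols)
    {
        var sheetData = new XElement("sheetData");
        for (var r = 1; r <= rows; r++)
        {
            var row = new XElement("row", new XAttribute("r", r));
            for (var c = 1; c <= cols; c++)
                row.Add(new XElement("c", new XElement("v", r * c)));
            sheetData.Add(row);
        }
        new XDocument(sheetData).Save(output);
    }

    // Streaming approach: each element is written as soon as it is produced,
    // so no tree of cells is retained in memory.
    public static void SaveThroughStreaming(Stream output, int rows, int cols)
    {
        using var writer = XmlWriter.Create(output);
        writer.WriteStartElement("sheetData");
        for (var r = 1; r <= rows; r++)
        {
            writer.WriteStartElement("row");
            writer.WriteAttributeString("r", r.ToString());
            for (var c = 1; c <= cols; c++)
            {
                writer.WriteStartElement("c");
                writer.WriteElementString("v", (r * c).ToString());
                writer.WriteEndElement();
            }
            writer.WriteEndElement();
        }
        writer.WriteEndElement();
    }
}
```

With the DOM variant, memory grows with the number of cells; with the streaming variant, it stays roughly constant regardless of sheet size.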
Cell value is now strongly typed
IXLCell.Value and IXLCell.CachedValue now have the type XLCellValue. At its core, xlsx consists of addressable cells with functions that transform a set of values in source cells to different values in target cells. It is really important to represent the potential values of cells by a sane type. All other things (pivot tables, auto filter, graphs) rely on this premise.
Cell value used to be represented as a string text and a value. The string depended on the value, e.g. 0/1 for boolean. That has been the case since the beginning of the ClosedXML project (see the original XLCell). The value was also returned as an Object.
This approach has several drawbacks:
Object is not a suitable representation of a cell value. Users had no idea what kinds of values could be returned as a cell value. Everything could also break down if a new type were returned (e.g. XLError).
IXLColumn.
The value of a cell is now represented by the XLCellValue structure. It is basically a union of the possible types that can be the value of a cell: blank, boolean, number, text, error, date-time and duration.
Since date-time and duration are basically masqueraded numbers, you can use XLCellValue.GetUnifiedNumber() to get the backing number, no matter whether the type is number, date-time or duration.
The structure contains implicit operators, as well as other methods, to make the transition as seamless as possible:
// Will use an implicit cast operator to convert string to XLCellValue and pass it to the Value setter
ws.Cell("A1").Value = "Text";
There is also a new singleton, Blank.Value, that represents a blank value of a cell. Null is not blank. An empty string is not a blank value of a cell either. Using null instead of blank was considered, but everything is just so much easier to work with if blank is represented as a custom singleton type and not as null.
XLCellValue will be able to represent all values of a cell and won't be boxed/unboxed all the time.
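A minimal sketch of working with the new value type, assuming ClosedXML 0.100 or later is referenced. The sheet name and the sample values are arbitrary; the Type property and the Get* accessors are used the way the structure exposes them.

```csharp
using System;
using ClosedXML.Excel;

using var wb = new XLWorkbook();
var ws = wb.AddWorksheet("Sheet1");

// Implicit operators convert common .NET types to XLCellValue.
ws.Cell("A1").Value = "Text";
ws.Cell("A2").Value = 42.5;
ws.Cell("A3").Value = true;
ws.Cell("A4").Value = new DateTime(2022, 12, 30);
ws.Cell("A5").Value = TimeSpan.FromHours(6);
ws.Cell("A6").Value = Blank.Value;

foreach (var address in new[] { "A1", "A2", "A3", "A4", "A5", "A6" })
{
    XLCellValue value = ws.Cell(address).Value;

    // Type says which member of the union is currently held.
    var description = value.Type switch
    {
        XLDataType.Blank => "blank",
        XLDataType.Boolean => $"boolean {value.GetBoolean()}",
        XLDataType.Number => $"number {value.GetNumber()}",
        XLDataType.Text => $"text '{value.GetText()}'",
        XLDataType.Error => $"error {value.GetError()}",
        // Date-time and duration are backed by a serial number.
        XLDataType.DateTime => $"date-time, serial {value.GetUnifiedNumber()}",
        XLDataType.TimeSpan => $"duration, serial {value.GetUnifiedNumber()}",
        _ => "unknown",
    };
    Console.WriteLine($"{address}: {description}");
}
```

Switching on Type keeps every possible kind of value visible in one place, which is exactly what the old Object-typed value could not offer.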
ClosedXML used to guess a data type from a value. That caused all sorts of unexpected behavior (e.g. the text value Z12.31 was converted to the date-time 12/30/2022 19:00). Dates caused the most problems, but other types sometimes did too (e.g. the text "Infinity" was detected as a number).
This behavior was likely intended to emulate how a user interacts with Excel. Excel guesses the type, but only if the cell Number Format is set to "General" (e.g. if NumberFormat is set to Text, there is no conversion even in Excel). An application is not a human and doesn't have to interact with xlsx in the same way.
This behavior was removed. The type that is set is the type that will be returned. Note that although XLCellValue can represent date and time as different types, in reality that is only presentation logic for the user. They are both just serial date-time numbers.
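The following sketch, assuming ClosedXML 0.100 or later, sets the two problematic texts mentioned above and a date, then prints what type comes back. No output is listed here; per the notes above, both texts should be reported as text.

```csharp
using System;
using ClosedXML.Excel;

using var wb = new XLWorkbook();
var ws = wb.AddWorksheet("Sheet1");

// Text that merely looks like a date or a number is stored as text.
ws.Cell("A1").Value = "Z12.31";
ws.Cell("A2").Value = "Infinity";
Console.WriteLine($"A1 is text: {ws.Cell("A1").Value.IsText}");
Console.WriteLine($"A2 is text: {ws.Cell("A2").Value.IsText}");

// A date is stored as a date, but it is backed by a serial date-time number.
ws.Cell("A3").Value = new DateTime(2022, 12, 30);
var date = ws.Cell("A3").Value;
Console.WriteLine($"A3 is date-time: {date.IsDateTime}, serial: {date.GetUnifiedNumber()}");
```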
Cell value now can accurately represent error or a blank value.
ClosedXML used to throw on an error value, and a cell couldn't contain an error. That was a significant problem, especially for formula calculation where a formula referenced a cell that should contain an error value.
ClosedXML used to represent a blank cell as an empty string, but no longer. It uses the Blank.Value singleton, wrapped in XLCellValue. This also brings a significant improvement in accuracy for CalcEngine evaluation.
Excel has a pretty complicated, undocumented coercion process from text to number. It can convert fraction text (="1 1/2"*2 is 3), dates (e.g. ="1900-01-05"*2 is 10, though the date format is culture specific), percentages (e.g. ="100%"*2), parentheses that imply a negative value (="(100%)"*2 is -2) and many more. That causes significant problems for formula evaluation, especially if the source cell contains a date as text, not as a date.
ClosedXML used to convert only text that looked like a double; it now coerces nearly everything Excel does. Coercion from dates should mostly work, but Excel has its own database of acceptable formats and its own format, while we rely on .NET Core infrastructure.
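To try the coercion, the formulas quoted above can be put into cells and evaluated. This is a sketch assuming ClosedXML 0.100 or later. The date example is left out because its result depends on the culture, and no results are listed because a formula may also evaluate to an error value.

```csharp
using System;
using ClosedXML.Excel;

using var wb = new XLWorkbook();
var ws = wb.AddWorksheet("Sheet1");

// Formulas taken from the text above; each multiplies a text operand by 2.
var formulas = new[]
{
    "\"1 1/2\"*2",   // fraction text
    "\"100%\"*2",    // percent text
    "\"(100%)\"*2",  // parentheses imply a negative value
};

for (var i = 0; i < formulas.Length; i++)
{
    var cell = ws.Cell(i + 1, 1);
    cell.FormulaA1 = formulas[i];

    // Reading Value evaluates the formula; the result is a number or an error.
    Console.WriteLine($"={formulas[i]} -> {cell.Value}");
}
```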
Thanks to the incorporation of XLError into the core of CalcEngine, exceptions are no longer necessary and have been removed. An error is a normal value type that is used during formula evaluation (e.g. ISNA accepts it and VLOOKUP returns it).
Technically speaking, CalcEngine can still throw MissingContextException, but only if the evaluation is not called from a cell, but from a method like XLWorkbook.Evaluate. Functions like ROW just can't work without the context of a cell.
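A sketch of both cases, assuming ClosedXML 0.100 or later. The general Exception type is caught on purpose, so the example does not depend on the exact namespace of MissingContextException; that choice is an assumption of this sketch, not a recommendation.

```csharp
using System;
using ClosedXML.Excel;

using var wb = new XLWorkbook();
var ws = wb.AddWorksheet("Sheet1");

// An error is an ordinary cell value that formulas can consume.
ws.Cell("A1").Value = XLError.NoValueAvailable;
ws.Cell("B1").FormulaA1 = "ISNA(A1)";
Console.WriteLine($"ISNA(A1) -> {ws.Cell("B1").Value}");

// A formula that fails is expected to yield an error value, not an exception.
ws.Cell("B2").FormulaA1 = "1/0";
Console.WriteLine($"1/0 is error: {ws.Cell("B2").Value.IsError}");

// Evaluation without a cell: ROW() has no cell whose row it could report.
try
{
    Console.WriteLine(wb.Evaluate("ROW()"));
}
catch (Exception ex)
{
    // Per the text above, this should be a MissingContextException.
    Console.WriteLine($"Evaluation without context failed: {ex.GetType().Name}");
}
```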
If you have ever tried to use CalcEngine, you have encountered the dreaded "The function SomeFunction was not recognised." exception.
ClosedXML will no longer throw an exception on an unimplemented function, but will return a #NAME? error instead. There are several reasons:
=SOME.UNKNOWN.FUN(4), why should it throw on =LARGE(A1:A5,1)?
Basically, the exception doesn't bring any benefit and only imposes costs. Users can report a missing function on a #NAME? error just like on an exception.
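A missing function can therefore be detected by checking the result instead of catching an exception. A minimal sketch, assuming ClosedXML 0.100 or later, using the made-up function name from the text above:

```csharp
using System;
using ClosedXML.Excel;

using var wb = new XLWorkbook();
var ws = wb.AddWorksheet("Sheet1");

// SOME.UNKNOWN.FUN does not exist, so evaluation should end with #NAME?.
ws.Cell("A1").FormulaA1 = "SOME.UNKNOWN.FUN(4)";
XLCellValue result = ws.Cell("A1").Value;

if (result.IsError && result.GetError() == XLError.NameNotRecognized)
    Console.WriteLine("Function is not implemented; report it or work around it.");
else
    Console.WriteLine($"Result: {result}");
```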
CalcEngine can now evaluate array literal expressions, so formulas like VLOOKUP(4, {1,2; 3,2; 5,3; 7,4}, 2) now actually work.
Array processing is limited to argument parsing across formulas, and CalcEngine still needs some love to process arrays correctly. Array formulas are still not implemented.
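The array literal example can be tried as follows, assuming ClosedXML 0.100 or later. In Excel, this formula uses an approximate match: 4 is not in the first column, so the row starting with 3 is selected and the formula returns 2.

```csharp
using System;
using ClosedXML.Excel;

using var wb = new XLWorkbook();
var ws = wb.AddWorksheet("Sheet1");

// The lookup table is given inline as an array literal with 4 rows and 2 columns.
ws.Cell("A1").FormulaA1 = "VLOOKUP(4, {1,2; 3,2; 5,3; 7,4}, 2)";
XLCellValue lookup = ws.Cell("A1").Value;
Console.WriteLine($"VLOOKUP result: {lookup}");
```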
Reimplementation of information and lookup functions
Information and lookup functions were reimplemented to take advantage of the other improvements. They should now be compliant with Excel (with the exception of wildcard search for VLOOKUP).
Documentation in the version control
Documentation is being moved from the wiki to ReadTheDocs. The ReadTheDocs site has been there since 2019, but we didn't actually have any documentation on it. Documentation is super important and ClosedXML is lacking in that area. It is of course a WIP, but it should improve over time (see https://closedxml.readthedocs.io/en/latest/features/protect.html, https://closedxml.readthedocs.io/en/latest/features/cell-format.html#number-format or the infamous https://closedxml.readthedocs.io/en/latest/tips/missing-font.html).
The move to ReadTheDocs has significant advantages:
We are not breaking compatibility just because. A break imposes a heavy penalty on users of the library. That makes people less likely to use it, and that is definitely not the goal. Even ClosedXML.Report must be fixed after every release.
That is not a desirable situation. Version 1.0 and semantic versioning are certainly the goal. But it must come with a clear API that can endure some development between minor versions. That is just not the case at the moment.
The API will be reviewed along with the documentation and will be adjusted as necessary. ClosedXML will practice release early, release often. If breaking changes are not acceptable, stay on a version that works and wait for 1.0 (though that will likely take at least a year, likely more... we are in our second decade).
Technically, we have been doing semver forever, since "Major version zero (0.y.z) is for initial development. Anything MAY change at any time. The public API SHOULD NOT be considered stable." Initial development for a decade /sigh.
Future plans
Similar to the current release, the general plan is to work on neglected foundational things and bug fixes.
It is likely there will be a 0.100.x to fix whatever bugs XLCellValue caused that weren't covered by tests.
Pivot tables won't get any love in 0.101, but hopefully in the next one. It is one of the distinguishing features of ClosedXML and it has a lot of reported issues.
What's Changed
Full Changelog: 0.97.0...0.100.0