Bogumił Kamiński, February 13, 2023
The tutorial is for DataFrames.jl 1.5.0
A brief introduction to basic usage of DataFrames.
The tutorial ships with a specification of the project environment under which it should be run. To prepare this environment, run the following command from the project folder on the command line before using the tutorial notebooks:
julia -e 'using Pkg; Pkg.activate("."); Pkg.instantiate()'
Tested under Julia 1.9.0. The project dependencies are the following:
[69666777] Arrow v2.4.3
[6e4b80f9] BenchmarkTools v1.3.2
[336ed68f] CSV v0.10.9
[324d7699] CategoricalArrays v0.10.7
[8be319e6] Chain v0.5.0
[944b1d66] CodecZlib v0.7.1
[a93c6f00] DataFrames v1.5.0
[1313f7d8] DataFramesMeta v0.13.0
[5789e2e9] FileIO v1.16.0
[da1fdf0e] FreqTables v0.4.5
[7073ff75] IJulia v1.24.0
[babc3d20] JDF v0.5.1
[9da8a3cd] JLSO v2.7.0
[b9914132] JSONTables v1.0.3
[86f7a689] NamedArrays v0.9.6
[2dfb63ee] PooledArrays v1.4.2
[f3b207a7] StatsPlots v0.15.4
[bd369af6] Tables v1.10.0
[a5390f91] ZipFile v0.10.1
[9a3f8284] Random
[10745b16] Statistics v1.9.0
I will try to keep the material up to date as the packages evolve.
This tutorial covers DataFrames.jl and CategoricalArrays.jl, as they constitute the core of working with DataFrames.jl, along with selected packages for reading and writing files.
The final extras part covers selected functionality of packages that I find useful for data manipulation; currently these are FreqTables.jl, DataFramesMeta.jl, and StatsPlots.jl.
Changelog:
Date        Changes
2017-12-05  Initial release
2017-12-06  Added description of insert!, merge!, empty!, categorical!, delete!, DataFrames.index
2017-12-09  Added performance tips
2017-12-10  Added pitfalls
2017-12-18  Added additional worthwhile packages: FreqTables and DataFramesMeta
2017-12-29  Added description of filter and filter!
2017-12-31  Added description of conversion to Matrix
2018-04-06  Added example of extracting a row from a DataFrame
2018-04-21  Major update of whole tutorial
2018-05-01  Added byrow! example
2018-05-13  Added StatPlots package to extras
2018-05-23  Improved comments in sections 1 to 5 by Jane Herriman
2018-07-25  Update to 0.11.7 release
2018-08-25  Update to Julia 1.0 release: sections 1 to 10
2018-08-29  Update to Julia 1.0 release: sections 11, 12 and 13
2018-09-05  Update to Julia 1.0 release: FreqTables section
2018-09-10  Added CSVFiles section to chapter on load/save
2018-09-26  Updated to DataFrames 0.14.0
2018-10-04  Updated to DataFrames 0.14.1, added haskey and repeat
2018-12-08  Updated to DataFrames 0.15.2
2019-01-03  Updated to DataFrames 0.16.0, added serialization instructions
2019-01-18  Updated to DataFrames 0.17.0, added passmissing
2019-01-27  Added Feather.jl file read/write
2019-01-30  Renamed StatPlots.jl to StatsPlots.jl and added Tables.jl
2019-02-08  Added groupvars and groupindices functions
2019-04-27  Updated to DataFrames 0.18.0, dropped JLD2.jl
2019-04-30  Updated handling of missing values description
2019-07-16  Updated to DataFrames 0.19.0
2019-08-14  Added JSONTables.jl and Tables.columnindex
2019-08-16  Added Project.toml and Manifest.toml
2019-08-26  Update to Julia 1.2 and DataFrames 0.19.3
2019-08-29  Add example how to compress/decompress CSV file using CodecZlib
2019-08-30  Add examples of JLSO.jl and ZipFile.jl by xiaodaigh
2019-11-03  Add examples of JDF.jl by xiaodaigh
2019-12-08  Updated to DataFrames 0.20.0
2020-05-06  Updated to DataFrames 0.21.0 (except load/save and extras)
2020-11-20  Updated to DataFrames 0.22.0 (except DataFramesMeta.jl which does not work yet)
2020-11-26  Updated to DataFramesMeta.jl 0.6; update by @pdeffebach
2021-05-15  Updated to DataFrames.jl 1.1.1
2021-05-15  Updated to DataFrames.jl 1.2 and DataFramesMeta.jl 0.8, added Chain.jl instead of Pipe.jl
2021-12-12  Updated to DataFrames.jl 1.3
2022-10-05  Updated to DataFrames.jl 1.4
2023-02-13  Updated to DataFrames.jl 1.5
DataFrame, DataFrame!, Tables.rowtable, Tables.columntable, Matrix, eachcol, eachrow, Tables.namedtupleiterator, empty, empty!
size, nrow, ncol, describe, names, eltypes, first, last, getindex, setindex!, @view, isapprox, metadata, metadata!, colmetadata, colmetadata!
missing (singleton instance of Missing), ismissing, nonmissingtype, skipmissing, replace, replace!, coalesce, allowmissing, disallowmissing, allowmissing!, completecases, dropmissing, dropmissing!, disallowmissing!, passmissing
CSV (package), CSVFiles (package), Serialization (module), CSV.read, CSV.write, save, load, serialize, deserialize, Arrow.write, Arrow.Table (from Arrow.jl package), JSONTables (package), arraytable, objecttable, jsontable, CodecZlib (module), GzipCompressorStream, GzipDecompressorStream, JDF.jl (package), JDF.save, JDF.load, JLSO.jl (package), JLSO.save, JLSO.load, ZipFile.jl (package), ZipFile.reader, ZipFile.writer, ZipFile.addfile
rename, rename!, hcat, insertcols!, categorical!, columnindex, hasproperty, select, select!, transform, transform!, combine, Not, All, Between, ByRow, AsTable
sort!, sort, issorted, append!, vcat, push!, view, filter, filter!, deleteat!, unique, nonunique, unique!, allunique, repeat, parent, parentindices, flatten, @chain (from Chain.jl package), only, subset, subset!, shuffle, prepend!, pushfirst!, insert!, keepat!
categorical, cut, isordered, ordered!, levels, unique, levels!, droplevels!, unwrap, recode, recode!
innerjoin, leftjoin, leftjoin!, rightjoin, outerjoin, semijoin, antijoin, crossjoin
stack, unstack
groupby, mapcols, parent, groupcols, valuecols, groupindices, keys (for GroupedDataFrame), combine, select, select!, transform, transform!, @chain (from Chain.jl package)
freqtable, prop, Name
@with, @subset, @select, @transform, @orderby, @by, @combine, @eachrow, @newcol, ^, $
@df, plot, density, histogram, boxplot, violin