Changelog
Unreleased
- [ENH] Added `row_count` parameter for janitor.conditional_join - Issue #1269 @samukweku
- [ENH] Reverse deprecation of `pivot_wider()` - Issue #1464
- [ENH] Add accessor and method for pandas DataFrameGroupBy objects. - Issue #587 @samukweku
- [ENH] Call mutate/summarise directly on groupby objects instead. Also add `ungroup` method to expose underlying dataframe of a grouped object. - Issue #1511 @samukweku
- [BUG] Fix incorrect output of `pivot_wider`, where `index` and `names_from` is provided, and values_from is None. - Issue #1509 @samukweku
v0.31.0 - 2025-03-07
- [ENH] Added support for pd.Series.select - Issue #1394 @samukweku
- [ENH] Added support for janitor.mutate - Issue #1226 @samukweku
- [ENH] Added support for janitor.summarise - Issue #1225 @samukweku
- [ENH] Added support for janitor.alias - Issue #1449 @samukweku
v0.30.0 - 2024-12-04
v0.29.2 - 2024-09-28
v0.29.1 - 2024-09-23
v0.29.0 - 2024-09-15
- [DOC] Un-deprecate `join_apply` as no alternative currently exists - Issue #1399 @lbeltrame
v0.28.1 - 2024-08-09
v0.28.0 - 2024-08-03
- [ENH] Added a `cartesian_product` function, as well as an `expand` method for pandas. - Issue #1293 @samukweku
- [ENH] Improve `pivot_longer` when `sort_by_appearance` is True. Added `pivot_longer_spec` for more control on how the dataframe should be unpivoted. - @samukweku #1361
- [ENH] Added `convert_excel_date` and `convert_matlab_date` methods for polars - Issue #1352
- [ENH] Added a `complete` method for polars. - Issue #1352 @samukweku
- [ENH] Added a `pivot_longer` method, and a `pivot_longer_spec` function for polars - Issue #1352 @samukweku
- [ENH] Added a `row_to_names` method for polars. Issue #1352 @samukweku
- [ENH] `read_commandline` function now supports polars - Issue #1352 @samukweku
- [ENH] `xlsx_cells` function now supports polars - Issue #1352 @samukweku
- [ENH] `xlsx_table` function now supports polars - Issue #1352 @samukweku
- [ENH] Added a `clean_names` method for polars - it can be used to clean the column names, or clean column values. Issue #1343 @samukweku
- [ENH] Improved performance for non-equi joins when using numba - @samukweku PR #1341
- [ENH] pandas Index, Series, DataFrame now supported in the `complete` method. - PR #1369 @samukweku
- [ENH] Improve performance for `first/last` in conditional_join, when the join columns in the right dataframe are sorted. - PR #1382 @samukweku
v0.27.0 - 2024-03-21
- [BUG] Fix logic for groupby in complete. Index support deprecated. Fix deprecation warning for fillna in `complete` PR #1289 @samukweku
- [ENH] `select` function now supports variable arguments - PR #1288 @samukweku
- [ENH] `conditional_join` now supports timedelta dtype. - PR #1297 @samukweku
- [ENH] `get_join_indices` function added - returns only join indices between two dataframes. Issue #1310 @samukweku
- [ENH] `explode_index` function added. - Issue #1283
- [ENH] `conditional_join` now supports timedelta dtype. - PR #1297
- [ENH] `change_index_dtype` added. - @samukweku Issue #1314
- [ENH] Add `glue` and `axis` parameters to `collapse_levels`. - Issue #211 @samukweku
- [ENH] `row_to_names` now supports multiple rows conversion to columns. - @samukweku Issue #1333
- [ENH] Fix warnings from Pandas. `truncate_datetime` now uses a vectorized option. - @samukweku #1337
v0.26.0 - 2023-09-18
- [ENH] `clean_names` can now be applied to column values. Issue #995 @samukweku
- [BUG] Fix ImportError - Issue #1285 @samukweku
v0.25.0 - 2023-07-27
- [INF] Replace `pytest.ini` file with `pyproject.toml` file. PR #1204 @Zeroto521
- [INF] Extract docstrings tests from all tests. PR #1205 @Zeroto521
- [BUG] Address the `TypeError` when importing v0.24.0 (issue #1201 @xujiboy and @joranbeasley)
- [INF] Fixed issue with missing PyPI README. PR #1216 @thatlittleboy
- [INF] Update some `mkdocs` compatibility code. PR #1231 @thatlittleboy
- [INF] Migrated docstring style from Sphinx to Google for better compatibility with `mkdocstrings`. PR #1235 @thatlittleboy
- [INF] Prevent selection of chevrons (`>>>`) and outputs in Example code blocks. PR #1237 @thatlittleboy
- [DEPR] Add deprecation warnings for `process_text`, `rename_column`, `rename_columns`, `filter_on`, `remove_columns`, `fill_direction`. Issue #1045 @samukweku
- [ENH] `pivot_longer` now supports named groups where `names_pattern` is a regular expression. A dictionary can now be passed to `names_pattern`, and is internally evaluated as a list/tuple of regular expressions. Issue #1209 @samukweku
- [ENH] Improve selection in `conditional_join`. Issue #1223 @samukweku
- [ENH] Add `col` class for selecting columns within an expression. Currently limited to use within `conditional_join`. PR #1260 @samukweku.
- [ENH] Performance improvement for range joins in `conditional_join`, when `use_numba = False`. Performance improvement for equi-join and a range join, when `use_numba = True`, for many to many join with wide ranges. PR #1256, #1267 @samukweku
- [DEPR] Add deprecation warning for `pivot_wider`. Issue #1045 @samukweku
- [BUG] Fix string column selection on a MultiIndex. Issue #1265. @samukweku
v0.24.0 - 2022-11-12
- [ENH] Add lazy imports to speed up the time taken to load pyjanitor (part 2)
- [DOC] Updated developer guide docs.
- [ENH] Allow column selection/renaming within conditional_join. Issue #1102. Also allow first or last match. Issue #1020 @samukweku.
- [ENH] New decorator `deprecated_kwargs` for breaking API. #1103 @Zeroto521
- [ENH] Extend select_columns to support non-string columns. Issue #1105 @samukweku
- [ENH] Performance improvement for groupby_topk. Issue #1093 @samukweku
- [ENH] `min_max_scale` drop `old_min` and `old_max` to fit sklearn's method API. Issue #1068 @Zeroto521
- [ENH] Add `jointly` option for `min_max_scale` support to transform each column values or entire values. Default transform each column, similar behavior to `sklearn.preprocessing.MinMaxScaler`. (Issue #1067, PR #1112, PR #1123) @Zeroto521
- [INF] Require pyspark minimal version is v3.2.0 to cut duplicates codes. Issue #1110 @Zeroto521
- [ENH] Add support for extension arrays in `expand_grid`. Issue #1121 @samukweku
- [ENH] Add `names_expand` and `index_expand` parameters to `pivot_wider` for exposing missing categoricals. Issue #1108 @samukweku
- [ENH] Add fix for slicing error when selecting columns in `pivot_wider`. Issue #1134 @samukweku
- [ENH] `dropna` parameter added to `pivot_longer`. Issue #1132 @samukweku
- [INF] Update `mkdocstrings` version and to fit its new coming features. PR #1138 @Zeroto521
- [BUG] Force `math.softmax` returning `Series`. PR #1139 @Zeroto521
- [INF] Set independent environment for building documentation. PR #1141 @Zeroto521
- [DOC] Add local documentation preview via github action artifact. PR #1149 @Zeroto521
- [ENH] Enable `encode_categorical` handle 2 (or more) dimensions array. PR #1153 @Zeroto521
- [TST] Fix testcases failing on Windows. Issue #1160 @Zeroto521, and @samukweku
- [INF] Cancel old workflow runs via Github Action `concurrency`. PR #1161 @Zeroto521
- [ENH] Faster computation for non-equi join, with a numba engine. Speed improvement for left/right joins when `sort_by_appearance` is False. Issue #1102 @samukweku
- [BUG] Avoid `change_type` mutating original `DataFrame`. PR #1162 @Zeroto521
- [ENH] The parameter `column_name` of `change_type` totally supports inputting multi-column now. #1163 @Zeroto521
- [ENH] Fix error when `sort_by_appearance=True` is combined with `dropna=True`. Issue #1168 @samukweku
- [ENH] Add explicit default parameter to `case_when` function. Issue #1159 @samukweku
- [BUG] pandas 1.5.x `_MergeOperation` doesn't have `copy` keyword anymore. Issue #1174 @Zeroto521
- [ENH] `select_rows` function added for flexible row selection. Generic `select` function added as well. Add support for MultiIndex selection via dictionary. Issue #1124 @samukweku
- [TST] Compat with macOS and Windows, to fix `FailedHealthCheck` Issue #1181 @Zeroto521
- [INF] Merge two docs CIs (`docs-preview.yml` and `docs.yml`) to one. And add `documentation` pytest mark. PR #1183 @Zeroto521
- [INF] Merge `codecov.yml` (only works for the dev branch pushing event) into `tests.yml` (only works for PR event). PR #1185 @Zeroto521
- [TST] Fix failure for test/timeseries/test_fill_missing_timestamp. Issue #1184 @samukweku
- [BUG] Import `DataDescription` to fix: `AttributeError: 'DataFrame' object has no attribute 'data_description'`. PR #1191 @Zeroto521
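
The `jointly` option for `min_max_scale` listed in this release distinguishes per-column scaling from scaling against one shared range. The following is a minimal plain-Python sketch of that difference, not pyjanitor's implementation; the helper `scale` and the sample table are hypothetical names introduced only for illustration.

```python
def scale(values, lo, hi):
    """Map values linearly so that lo -> 0.0 and hi -> 1.0."""
    return [(v - lo) / (hi - lo) for v in values]


# Hypothetical two-column table, stored as a dict of lists.
table = {"a": [1, 2, 3], "b": [0, 5, 10]}

# Per-column scaling: each column uses its own minimum and maximum.
per_column = {name: scale(col, min(col), max(col)) for name, col in table.items()}

# Joint scaling: every column shares the global minimum and maximum.
all_values = [v for col in table.values() for v in col]
lo, hi = min(all_values), max(all_values)
jointly = {name: scale(col, lo, hi) for name, col in table.items()}

print(per_column["a"])  # [0.0, 0.5, 1.0]
print(jointly["a"])     # [0.1, 0.2, 0.3]
```

Per-column scaling stretches every column to the full 0 to 1 range, while joint scaling preserves the relative magnitudes between columns.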
v0.23.1 - 2022-05-03
- [DOC] Updated `fill.py` and `update_where.py` documentation with working examples.
- [ENH] Deprecate `num_bins` from `bin_numeric` in favour of `bins`, and allow generic `**kwargs` to be passed into `pd.cut`. Issue #969. @thatlittleboy
- [ENH] Fix `concatenate_columns` not working on category inputs @zbarry
- [INF] Simplify CI system @ericmjl
- [ENH] Added "read_commandline" function to janitor.io @BaritoneBeard
- [BUG] Fix bug with the complement parameter of `filter_on`. Issue #988. @thatlittleboy
- [ENH] Add `xlsx_table`, for reading tables from an Excel sheet. @samukweku
- [ENH] minor improvements for conditional_join; equality only joins are no longer supported; there has to be at least one non-equi join present. @samukweku
- [BUG] `sort_column_value_order` no longer mutates original dataframe.
- [BUG] Extend `fill_empty`'s `column_names` type range. Issue #998. @Zeroto521
- [BUG] Removed/updated error-inducing default arguments in `row_to_names` (#1004) and `round_to_fraction` (#1005). @thatlittleboy
- [ENH] `patterns` deprecated in favour of importing `re.compile`. #1007 @samukweku
- [ENH] Changes to kwargs in `encode_categorical`, where the values can either be a string or a 1D array. #1021 @samukweku
- [ENH] Add `fill_value` and `explicit` parameters to the `complete` function. #1019 @samukweku
- [ENH] Performance improvement for `expand_grid`. @samukweku
- [BUG] Make `factorize_columns` (PR #1028) and `truncate_datetime_dataframe` (PR #1040) functions non-mutating. @thatlittleboy
- [BUG] Fix SettingWithCopyWarning and other minor bugs when using `truncate_datetime_dataframe`, along with further performance improvements (PR #1040). @thatlittleboy
- [ENH] Performance improvement for `conditional_join`. @samukweku
- [ENH] Multiple `.value` is now supported in `pivot_longer`. Multiple values_to is also supported, when names_pattern is a list or tuple. `names_transform` parameter added, for efficient dtype transformation of unpivoted columns. #1034, #1048, #1051 @samukweku
- [ENH] Add `xlsx_cells` for reading a spreadsheet as a table of individual cells. #929 @samukweku.
- [ENH] Let `filter_string` suit parameters of `Series.str.contains` Issue #1003 and #1047. @Zeroto521
- [ENH] `names_glue` in `pivot_wider` now takes a string form, using str.format_map under the hood. `levels_order` is also deprecated. @samukweku
- [BUG] Fixed bug in `transform_columns` which ignored the `column_names` specification when `new_column_names` dictionary was provided as an argument, issue #1063. @thatlittleboy
- [BUG] `count_cumulative_unique` no longer modifies the column being counted in the output when `case_sensitive` argument is set to False, issue #1065. @thatlittleboy
- [BUG] Fix for gcc missing error in dev container
- [DOC] Added a step in the dev guide to install `Remote Container` in VS Code. @ashenafiyb
- [DOC] Convert `expand_column` and `find_replace` code examples to doctests, issue #972. @gahjelle
- [DOC] Convert `expand_column` code examples to doctests, issue #972. @gahjelle
- [DOC] Convert `get_dupes` code examples to doctests, issue #972. @ethompsy
- [DOC] Convert `engineering` code examples to doctests, issue #972 @ashenafiyb
- [DOC] Convert `groupby_topk` code examples to doctests, issue #972. @ethompsy
- [DOC] Add doctests to `math`, issue #972. @gahjelle
- [DOC] Add doctests to `math` and `ml`, issue #972. @gahjelle
- [DOC] Add doctests to `math`, `ml`, and `xarray`, issue #972. @gahjelle
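
The `names_glue` entry in this release mentions that the string form relies on str.format_map. As a rough illustration of that built-in only (not of pyjanitor's internals), the sketch below fills a template from a mapping of level names to labels; the template and the `labels` mapping are hypothetical.

```python
# Hypothetical labels for one output column: one value per pivoted level.
labels = {"variable": "height", "year": 2020}

# A template in the style of a glue string; placeholders name the levels.
template = "{variable}_{year}"

# str.format_map looks each placeholder up in the mapping directly.
new_name = template.format_map(labels)
print(new_name)  # height_2020
```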
v0.22.0 - 2021-11-21
- [BUG] Fix conditional join issue for multiple conditions, where pd.eval fails to evaluate if numexpr is installed. #898 @samukweku
- [ENH] Added `case_when` to handle multiple conditionals and replacement values. Issue #736. @robertmitchellv
- [ENH] Deprecate `new_column_names` and `merge_frame` from `process_text`. Only existing columns are supported. @samukweku
- [ENH] `complete` uses `pd.merge` internally, providing a simpler logic, with some speed improvements in certain cases over `pd.reindex`. @samukweku
- [ENH] `expand_grid` returns a MultiIndex DataFrame, allowing the user to decide how to manipulate the columns. @samukweku
- [INF] Simplify a bit linting, use pre-commit as the CI linting checker. @Zeroto521
- [ENH] Fix bug in `pivot_longer` for wrong output when `names_pattern` is a sequence with a single value. Issue #885 @samukweku
- [ENH] Deprecate `aggfunc` from `pivot_wider`; aggregation can be chained with pandas' `groupby`.
- [ENH] `As_Categorical` deprecated from `encode_categorical`; a tuple of `(categories, order)` suffices for **kwargs. @samukweku
- [ENH] Deprecate `names_sort` from `pivot_wider`. @samukweku
- [ENH] Add `softmax` to `math` module. Issue #902. @loganthomas
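
For readers unfamiliar with the `softmax` function added in this release, here is a minimal standard-library sketch of the usual definition, exp(x_i) divided by the sum of exp(x_j). It is an assumption-laden illustration on a plain list, not the pyjanitor implementation, which operates on pandas objects.

```python
import math


def softmax(values):
    """Return exp(v) / sum(exp(values)) for each v.

    The maximum is subtracted first; this leaves the result unchanged
    but avoids overflow for large inputs.
    """
    shift = max(values)
    exps = [math.exp(v - shift) for v in values]
    total = sum(exps)
    return [e / total for e in exps]


probs = softmax([1.0, 2.0, 3.0])
print(probs)  # three increasing weights that sum to 1
```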
v0.21.2 - 2021-09-01
- [ENH] Fix warning message in `coalesce`, from bfill/fill; `coalesce` now uses variable arguments. Issue #882 @samukweku
- [INF] Add SciPy as explicit dependency in `base.in`. Issue #895 @ericmjl
v0.21.1 - 2021-08-29
- [DOC] Fix references and broken links in AUTHORS.rst. @loganthomas
- [DOC] Updated Broken links in the README and contributing docs. @nvamsikrishna05
- [INF] Update pre-commit hooks and remove mutable references. Issue #844. @loganthomas
- [INF] Add GitHub Release pointer to auto-release script. Issue #818. @loganthomas
- [INF] Updated black version in github actions code-checks to match pre-commit hooks. @nvamsikrishna05
- [ENH] Add reset_index flag to row_to_names function. @fireddd
- [ENH] Updated `label_encode` to use pandas factorize instead of scikit-learn LabelEncoder. @nvamsikrishna05
- [INF] Removed the scikit-learn package from the dependencies from environment-dev.yml and base.in files. @nvamsikrishna05
- [ENH] Add function to remove constant columns. @fireddd
- [ENH] Added `factorize_columns` method which will deprecate the `label_encode` method in future release. @nvamsikrishna05
- [DOC] Delete Read the Docs project and remove all readthedocs.io references from the repo. Issue #863. @loganthomas
- [DOC] Updated various documentation sources to reflect pyjanitor-dev ownership. @loganthomas
- [INF] Fix `isort` automatic checks. Issue #845. @loganthomas
- [ENH] `complete` function now uses variable args (*args) - @samukweku
- [ENH] Set `expand_column`'s `sep` default is `"|"`, same to `pandas.Series.str.get_dummies`. Issue #876. @Zeroto521
- [ENH] Deprecate `limit` from fill_direction. fill_direction now uses kwargs. @samukweku
- [ENH] Added `conditional_join` function that supports joins on non-equi operators. @samukweku
- [INF] Speed up pytest via `-n` (pytest-xdist) option. Issue #881. @Zeroto521
- [DOC] Add list mark to keep `select_columns`'s example same style. @Zeroto521
- [ENH] Updated `rename_columns` to take optional function argument for mapping. @nvamsikrishna05
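
This release introduces `conditional_join` for joins on non-equi operators. To show what a non-equi join means, the sketch below pairs rows with a naive nested loop over plain dictionaries. The function `non_equi_join` and the sample rows are hypothetical; this is neither pyjanitor's API nor its algorithm.

```python
import operator


def non_equi_join(left, right, left_key, right_key, op):
    """Pair every left row with every right row where op(left, right) holds."""
    return [
        {**l_row, **r_row}
        for l_row in left
        for r_row in right
        if op(l_row[left_key], r_row[right_key])
    ]


# Hypothetical rows; keep a left row for each threshold its value exceeds.
left = [{"id": 1, "value": 5}, {"id": 2, "value": 12}]
right = [{"threshold": 3}, {"threshold": 10}]

matches = non_equi_join(left, right, "value", "threshold", operator.gt)
print(matches)
```

An equi-join would use `operator.eq` here; a non-equi join swaps in a comparison such as `>` or `<=`, so one row can match many rows on the other side.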
v0.21.0 - 2021-07-16
- [ENH] Drop `fill_value` parameter from `complete`. Users can use `fillna` instead. @samukweku
- [BUG] Fix bug in `pivot_longer` with single level columns. @samukweku
- [BUG] Disable exchange rates API until we can find another one to hit. @ericmjl
- [ENH] Change `coalesce` to return columns; also use `bfill`, `ffill`, which is faster than `combine_first` @samukweku
- [ENH] Use `eval` for string conditions in `update_where`. @samukweku
- [ENH] Add clearer error messages for `pivot_longer`. h/t to @tdhock for the observation. Issue #836 @samukweku
- [ENH] `select_columns` now uses variable arguments (*args), to provide a simpler selection without the need for lists. - @samukweku
- [ENH] `encode_categoricals` refactored to use generic functions via `functools.dispatch`. - @samukweku
- [ENH] Updated convert_excel_date to throw meaningful error when values contain non-numeric. @nvamsikrishna05
v0.20.14 - 2021-03-25
- [ENH] Add `dropna` parameter to groupby_agg. @samukweku
- [ENH] `complete` adds a `by` parameter to expose explicit missing values per group, via groupby. @samukweku
- [ENH] Fix check_column to support single inputs - fixes `label_encode`. @zbarry
v0.20.13 - 2021-02-25
- [ENH] Performance improvements to `expand_grid`. @samukweku
- [HOTFIX] Add `multipledispatch` to pip requirements. @ericmjl
v0.20.12 - 2021-02-25
- [INF] Auto-release GitHub action maintenance. @loganthomas
v0.20.11 - 2021-02-24
- [INF] Setup auto-release GitHub action. @loganthomas
- [INF] Deploy `darglint` package for docstring linting. Issue #745. @loganthomas
- [ENH] Added optional truncation to `clean_names` function. Issue #753. @richardqiu
- [ENH] Added `timeseries.flag_jumps()` function. Issue #711. @loganthomas
- [ENH] `pivot_longer` can handle multiple values in paired columns, and can reshape using a list/tuple of regular expressions in `names_pattern`. @samukweku
- [ENH] Replaced default numeric conversion of dataframe with a `dtypes` parameter, allowing the user to control the data types. - @samukweku
- [INF] Loosen dependency specifications. Switch to pip-tools for managing dependencies. Issue #760. @MinchinWeb
- [DOC] added pipenv installation instructions @evan-anderson
- [ENH] Add `pivot_wider` function, which is the inverse of the `pivot_longer` function. @samukweku
- [INF] Add `openpyxl` to `environment-dev.yml`. @samukweku
- [ENH] Reduce code by reusing existing functions for fill_direction. @samukweku
- [ENH] Improvements to `pivot_longer` function, with improved speed and cleaner code. `dtypes` parameter dropped; user can change dtypes with pandas' `astype` method, or pyjanitor's `change_type` method. @samukweku
- [ENH] Add kwargs to `encode_categorical` function, to create ordered categorical columns, or categorical columns with explicit categories. @samukweku
- [ENH] Improvements to `complete` method. Use `pd.merge` to handle duplicates and null values. @samukweku
- [ENH] Add `new_column_names` parameter to `process_text`, allowing a user to create a new column name after processing a text column. Also added a `merge_frame` parameter, allowing dataframe merging, if the result of the text processing is a dataframe. @samukweku
- [ENH] Add `aggfunc` parameter to pivot_wider. @samukweku
- [ENH] Modified the `check` function in utils to verify if a value is a callable. @samukweku
- [ENH] Add a base `_select_column` function, using `functools.singledispatch`, to allow for flexible columns selection. @samukweku
- [ENH] pivot_longer and pivot_wider now support janitor.select_columns syntax, allowing for more flexible and dynamic column selection. @samukweku
v0.20.10
- [ENH] Added function `sort_timestamps_monotonically` to timeseries functions @UGuntupalli
- [ENH] Added the complete function for converting implicit missing values to explicit ones. @samukweku
- [ENH] Further simplification of expand_grid. @samukweku
- [BUGFIX] Added copy() method to original dataframe, to avoid mutation. Issue #729. @samukweku
- [ENH] Added `also` method for running functions in chain with no return values.
- [DOC] Added a `timeseries` module section to website docs. Issue #742. @loganthomas
- [ENH] Added a `pivot_longer` function, a wrapper around `pd.melt` and similar to tidyr's `pivot_longer` function. Also added an example notebook. @samukweku
- [ENH] Fixed code to return an error if `fill_value` is not a dictionary. @samukweku
- [INF] Welcome bot (.github/config.yml) for new users added. Issue #739. @samukweku
v0.20.9
- [ENH] Updated groupby_agg function to account for null entries in the `by` argument. @samukweku
- [ENH] Added function `groupby_topk` to janitor functions @mphirke
v0.20.8
- [ENH] Upgraded `update_where` function to use either the pandas query style, or boolean indexing via the `loc` method. Also updated `find_replace` function to use the `loc` method directly, instead of routing it through the `update_where` function. @samukweku
- [INF] Update `pandas` minimum version to 1.0.0. @hectormz
- [DOC] Updated the general functions API page to show all available functions. @samukweku
- [DOC] Fix the few lacking type annotations of functions. @VPerrollaz
- [DOC] Changed the signature from str to Optional[str] when initialized by None. @VPerrollaz
- [DOC] Add the Optional type for all signatures of the API. @VPerrollaz
- [TST] Updated test_expand_grid to account for int dtype difference in Windows OS @samukweku
- [TST] Make importing `pandas` testing functions follow uniform pattern. @hectormz
- [ENH] Added `process_text` wrapper function for all Pandas string methods. @samukweku
- [TST] Only skip tests for non-installed libraries on local machine. @hectormz
- [DOC] Fix minor issues in documentation. @hectormz
- [ENH] Added `fill_direction` function for forward/backward fills on missing values for selected columns in a dataframe. @samukweku
- [ENH] Simpler logic and fewer lines of code for expand_grid function @samukweku
v0.20.7
- [TST] Add a test for transform_column to check for nonmutation. @VPerrollaz
- [ENH] Contributed `expand_grid` function by @samukweku
v0.20.6
- [DOC] Pep8 all examples. @VPerrollaz
- [TST] Add docstrings to tests @hectormz
- [INF] Add `debug-statements`, `requirements-txt-fixer`, and `interrogate` to `pre-commit`. @hectormz
- [ENH] Upgraded transform_column to use df.assign underneath the hood, and also added option to transform column elementwise (via apply) or columnwise (thus operating on a series). @ericmjl
v0.20.5
- [INF] Replace `pycodestyle` with `flake8` in order to add `pandas-vet` linter @hectormz
- [ENH] `select_columns()` now raises `NameError` if column label in `search_columns_labels` is missing from `DataFrame` columns. @smu095
v0.20.1
- [DOC] Added an example for groupby_agg in general functions @samukweku
- [ENH] Contributed `sort_naturally()` function. @ericmjl
v0.20.0
- [DOC] Edited transform_column dest_column_name kwarg description to be clearer on defaults by @evan-anderson.
- [ENH] Replace `apply()` in favor of `pandas` functions in several functions. @hectormz
- [ENH] Add `ecdf()` Series function by @ericmjl.
- [DOC] Update API policy for clarity. @ericmjl
- [ENH] Enforce string conversion when cleaning names. @ericmjl
- [ENH] Change `find_replace` implementation to use keyword arguments to specify columns to perform find and replace on. @ericmjl
- [ENH] Add `jitter()` dataframe function by @rahosbach
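
The `ecdf()` Series function added in this release computes an empirical cumulative distribution function. As a hedged illustration of the concept only, the plain-Python sketch below returns the sorted values together with the fraction of observations at or below each one; the exact return format of pyjanitor's `ecdf()` is not stated in this changelog and may differ.

```python
def ecdf(values):
    """Return sorted values and their cumulative fractions i / n."""
    xs = sorted(values)
    n = len(xs)
    ys = [(i + 1) / n for i in range(n)]
    return xs, ys


xs, ys = ecdf([3, 1, 2, 4])
print(xs)  # [1, 2, 3, 4]
print(ys)  # [0.25, 0.5, 0.75, 1.0]
```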
v0.19.0
- [ENH] Add xarray support and clone_using / convert_datetime_to_number funcs by @zbarry.
v0.18.3
- [ENH] Series toset() functionality #570 @eyaltrabelsi
- [ENH] Added option to coalesce function to not delete coalesced columns. @gddcunh
- [ENH] Added functionality to deconcatenate tuple/list/collections in a column to deconcatenate_column @zbarry
- [ENH] Fix error message when length of new_column_names is wrong @DollofCutty
- [DOC] Fixed several examples of functional syntax in `functions.py`. @bdice
- [DOC] Fix #noqa comments showing up in docs by @hectormz
- [ENH] Add unionizing a group of dataframes' categoricals. @zbarry
- [DOC] Fix contributions hyperlinks in `AUTHORS.rst` and contributions by @hectormz
- [INF] Add `pre-commit` hooks to repository by @ericmjl
- [DOC] Fix formatting code in `CONTRIBUTING.rst` by @hectormz
- [DOC] Changed the typing for most "column_name(s)" to Hashable rather than enforcing strings, to more closely match Pandas API by @dendrondal
- [INF] Edited pycodestyle and Black parameters to avoid venvs by @dendrondal
v0.18.2
- [INF] Make requirements.txt smaller @eyaltrabelsi
- [ENH] Add a reset_index parameter to shuffle @eyaltrabelsi
- [DOC] Added contribution page link to readme @eyaltrabelsi
- [DOC] fix example for `update_where`, provide a bit more detail, and expand the bad_values example notebook to demonstrate its use by @anzelpwj.
- [INF] Fix pytest marks by @ericmjl (issue #520)
- [ENH] add example notebook with use of finance submodule methods by @rahosbach
- [DOC] added a couple of admonitions for Windows users. h/t @anzelpwj for debugging help when a few tests failed for `win32` @Ram-N
- [ENH] Pyjanitor for PySpark @zjpoh
- [ENH] Add pyspark clean_names @zjpoh
- [ENH] Convert asserts to raise exceptions by @hectormz
- [ENH] Add decorator functions for missing and error handling @jiafengkevinchen
- [DOC] Update README with functional `pandas` API example. @ericmjl
- [INF] Move `get_features_targets()` to new `ml.py` module by @hectormz
- [ENH] Add chirality to morgan fingerprints in janitor.chemistry submodule by @Clayton-Springer
- [INF] `import_message` suggests python dist. appropriate installs by @hectormz
- [ENH] Add count_cumulative_unique() method to janitor.functions submodule by @rahosbach
- [ENH] Add `update_where()` method to `janitor.spark.functions` submodule by @zjpoh
v0.18.1
- [ENH] extend find_replace functionality to allow both exact match and regular-expression-based fuzzy match by @shandou
- [ENH] add preserve_position kwarg to deconcatenate_column with tests by @shandou and @ericmjl
- [DOC] add contributions that did not leave `git` traces by @ericmjl
- [ENH] add inflation adjustment in finance submodule by @rahosbach
- [DOC] clarified how new functions should be implemented by @shandou
- [ENH] add optional removal of accents on functions.clean_names, enabled by default by @mralbu
- [ENH] add camelCase conversion to snake_case on `clean_names` by @ericmjl, h/t @jtaylor for sharing original
- [ENH] Added `null_flag` function which can mark null values in rows. Implemented by @anzelpwj
- [ENH] add engineering submodule with unit conversion method by @rahosbach
- [DOC] add PyPI project description
- [ENH] add example notebook with use of finance submodule methods by @rahosbach
For changes that happened prior to v0.18.1, please consult the closed PRs, which can be found here.
We thank all contributors who have helped make pyjanitor the package that it is today.