What’s new in 2.3.0 (Month XX, 2024)#
These are the changes in pandas 2.3.0. See Release notes for a full changelog including other versions of pandas.
Upcoming changes in pandas 3.0#
Enhancements#
enhancement1#
Other enhancements#
- The semantics of the `copy` keyword in `__array__` methods (i.e. called when using `np.array()` or `np.asarray()` on pandas objects) have been updated to work correctly with NumPy >= 2 (GH 57739)
- `Series.str.decode()` result now has `StringDtype` when `future.infer_string` is True (GH 60709)
- `DataFrame.to_hdf()` and `Series.to_hdf()` now round-trip with `StringDtype` (GH 60663)
- Improved `repr` of `NumpyExtensionArray` to account for NEP51 (GH 61085)
- `Series.str.decode()` has gained the argument `dtype` to control the dtype of the result (GH 60940)
- The `cumsum()`, `cummin()`, and `cummax()` reductions are now implemented for `StringDtype` columns (GH 60633)
- The `sum()` reduction is now implemented for `StringDtype` columns (GH 59853)
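As a rough illustration of two of the items above, the sketch below requests an explicit copy through `np.array()` and concatenates a `StringDtype` column with `sum()`. The helper names `independent_copy` and `concat_strings` are hypothetical, and the `TypeError` fallback is an assumption about how versions without the new reduction behave.

```python
import numpy as np
import pandas as pd


def independent_copy(ser: pd.Series) -> np.ndarray:
    """Return a NumPy array that does not share memory with ``ser``."""
    # On NumPy >= 2 the copy keyword is passed on to the object's __array__.
    return np.array(ser, copy=True)


def concat_strings(ser: pd.Series) -> str:
    """Concatenate the non-missing strings of a StringDtype Series."""
    try:
        # sum() on StringDtype, as described in the release note.
        return ser.sum()
    except TypeError:
        # Assumed fallback for pandas versions without this reduction.
        return "".join(ser.dropna())


numbers = pd.Series([1, 2, 3])
arr = independent_copy(numbers)
arr[0] = 99  # mutating the copy leaves the Series untouched

words = pd.Series(["a", "b", None], dtype="string")
joined = concat_strings(words)
```

The fallback branch only exists to keep the sketch usable across versions; on a release that includes the change, `ser.sum()` alone should be enough.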
Notable bug fixes#
These are bug fixes that might have notable behavior changes.
notable_bug_fix1#
API changes#
- When enabling the `future.infer_string` option: Index set operations (like union or intersection) will now ignore the dtype of an empty `RangeIndex` or empty `Index` with object dtype when determining the dtype of the resulting Index (GH 60797)
Deprecations#
- Deprecated allowing non-`bool` values for `na` in `str.contains()`, `str.startswith()`, and `str.endswith()` for dtypes that do not already disallow these (GH 59615)
- Deprecated the `"pyarrow_numpy"` storage option for `StringDtype` (GH 60152)
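One way to stay clear of the `na` deprecation is to pass an explicit boolean. The following is a minimal sketch with made-up sample data; it assumes you want missing entries treated as non-matches.

```python
import pandas as pd

fruits = pd.Series(["apple", None, "banana"], dtype="string")

# Pass a real bool for `na` so missing entries map to a boolean result.
has_an = fruits.str.contains("an", na=False)
starts_b = fruits.str.startswith("b", na=False)
ends_e = fruits.str.endswith("e", na=False)
```

Using `na=True` instead would count missing entries as matches; which of the two fits depends on the filter being built.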
Performance improvements#
Bug fixes#
Categorical#
Datetimelike#
Timedelta#
Timezones#
Numeric#
- Enabled `Series.mode` and `DataFrame.mode` with `dropna=False` to sort the result for all dtypes in the presence of NA values; previously only certain dtypes would sort (GH 60702)
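To make the `dropna=False` case concrete, here is a small sketch with made-up data in which a missing value ties with two ordinary values. The ordering of the result is the part this note changes, so the sketch deliberately does not rely on it.

```python
import pandas as pd

values = pd.Series([3.0, 1.0, None, 3.0, 1.0, None])

# Each distinct value, including NaN, occurs twice, so all three are modes.
modes = values.mode(dropna=False)

# Separate the ordinary modes from the missing one for inspection.
non_missing = modes.dropna().tolist()
missing_count = int(modes.isna().sum())
```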
Conversion#
Strings#
- Bug in `DataFrameGroupBy.min()`, `DataFrameGroupBy.max()`, `Resampler.min()`, `Resampler.max()` on string input of all NA values would return float dtype; now returns string (GH 60810)
- Bug in `DataFrame.sum()` with `axis=1`, `DataFrameGroupBy.sum()` or `SeriesGroupBy.sum()` with `skipna=True`, and `Resampler.sum()` on `StringDtype` with all NA values resulted in `0` and is now the empty string `""` (GH 60229)
- Bug in `Series.__pos__()` and `DataFrame.__pos__()` did not raise for `StringDtype` with `storage="pyarrow"` (GH 60710)
- Bug in `Series.rank()` for `StringDtype` with `storage="pyarrow"` incorrectly returning integer results in case of `method="average"` and raising an error if it would truncate results (GH 59768)
- Bug in `Series.replace()` with `StringDtype` when replacing with a non-string value was not upcasting to `object` dtype (GH 60282)
- Bug in `Series.str.replace()` when `n < 0` for `StringDtype` with `storage="pyarrow"` (GH 59628)
- Bug in `ser.str.slice` with negative `step` with `ArrowDtype` and `StringDtype` with `storage="pyarrow"` giving incorrect results (GH 59710)
- Bug in the `center` method on `Series` and `Index` object `str` accessors with pyarrow-backed dtype not matching the Python behavior in corner cases with an odd number of fill characters (GH 54792)
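The "Python behavior" that the last two items refer to can be sketched with plain `str` methods and an object-dtype Series, which applies those methods element by element. This sketch does not exercise the pyarrow-backed dtypes themselves; it only shows the reference results they are expected to match, using made-up sample strings.

```python
import pandas as pd

# Plain Python is the reference: three fill characters cannot be split
# evenly, so one side receives the extra one.
py_odd_width = "ab".center(5, "*")
py_even_width = "abc".center(6, "*")

# Object dtype delegates to the Python string methods per element.
ser = pd.Series(["ab", "abc"], dtype=object)
centered = ser.str.center(5, "*").tolist()

# A negative step walks each string backwards, like "abc"[::-1].
reversed_strings = ser.str.slice(step=-1).tolist()
```

Which side gets the extra fill character depends on both the padding and the target width, which is why these corner cases are easy to get wrong in a reimplementation.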
Interval#
Indexing#
- Fixed bug in `Index.get_indexer()` round-tripping through string dtype when `infer_string` is enabled (GH 55834)
Missing#
MultiIndex#
I/O#
- `DataFrame.to_excel()` was storing decimals as strings instead of numbers (GH 49598)
Period#
Plotting#
Groupby/resample/rolling#
Reshaping#
Sparse#
ExtensionArray#
Styler#
Other#
- Fixed usage of `inspect` when the optional dependencies `pyarrow` or `jinja2` are not installed (GH 60196)