import pandas as pd
from pyspark.sql import SparkSession

# Path to the input CSV; the original `<'path to file'>` was a placeholder
# that is not valid Python syntax.  Replace with a real path before running.
filename = "path/to/file.csv"

# `builder` is a property on SparkSession; the original called the
# non-existent `SparkSession.build`.
spark = SparkSession.builder.appName("pandasToSpark").getOrCreate()

# Assuming file is csv: load it with pandas first.
pandas_df = pd.read_csv(filename)

# Convert pandas -> Spark.  The method is `createDataFrame` (lowercase 'c');
# the original `CreateDataFrame` would raise AttributeError.
spark_df = spark.createDataFrame(pandas_df)

# Round-trip the Spark DataFrame back to pandas.
pandas_df = spark_df.select("*").toPandas()
from pyspark.sql import SparkSession

# Create PySpark SparkSession.  The chained builder calls must be a single
# expression — the original spread them over bare lines with no parentheses
# or backslashes, which is a syntax error.
spark = (
    SparkSession.builder
    .master("local[1]")
    .appName("SparkByExamples.com")
    .getOrCreate()
)

# Sample pandas DataFrame matching the schema and rows shown in the output
# below; the original referenced an undefined name `pandasDF`.
pandasDF = pd.DataFrame(
    {"Name": ["Scott", "Jeff", "Thomas", "Ann"], "Age": [50, 45, 54, 34]}
)

# Create PySpark DataFrame from Pandas and show its schema and contents.
sparkDF = spark.createDataFrame(pandasDF)
sparkDF.printSchema()
sparkDF.show()
# Outputs below schema & DataFrame (console output — kept as comments,
# since the raw text is not valid Python):
#
# root
#  |-- Name: string (nullable = true)
#  |-- Age: long (nullable = true)
#
# +------+---+
# |  Name|Age|
# +------+---+
# | Scott| 50|
# |  Jeff| 45|
# |Thomas| 54|
# |   Ann| 34|
# +------+---+
# Build a small Spark DataFrame from an in-memory list of tuples.
# Fixes vs. the original: `some_df` was used (toPandas) one line BEFORE it
# was defined, and `sc` was never bound — use the SparkContext hanging off
# the `spark` session created earlier in this file.
some_df = spark.sparkContext.parallelize(
    [
        ("A", "no"),
        ("B", "yes"),
        ("B", "yes"),
        ("B", "no"),
    ]
).toDF(["user_id", "phone_number"])

# Convert the Spark DataFrame back to a pandas DataFrame.
pandas_df = some_df.toPandas()