convert pandas dataframe to spark dataframe

import pandas as pd
from pyspark.sql import SparkSession

filename = 'path to file'  # placeholder: replace with the path to your CSV file
spark = SparkSession.builder.appName('pandasToSpark').getOrCreate()
# Assuming file is csv
pandas_df = pd.read_csv(filename)
spark_df = spark.createDataFrame(pandas_df)

spark to pandas

pandas_df = spark_df.select("*").toPandas()

dataframe pandas to spark


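import pandas as pd
from pyspark.sql import SparkSession
# Sample pandas DataFrame (the rows shown in the output below)
pandasDF = pd.DataFrame({
    "Name": ["Scott", "Jeff", "Thomas", "Ann"],
    "Age": [50, 45, 54, 34],
})
# Create PySpark SparkSession
spark = (
    SparkSession.builder
    .master("local[1]")
    .appName("SparkByExamples.com")
    .getOrCreate()
)
# Create PySpark DataFrame from Pandas
sparkDF = spark.createDataFrame(pandasDF)
sparkDF.printSchema()
sparkDF.show()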

#Outputs below schema & DataFrame

root
 |-- Name: string (nullable = true)
 |-- Age: long (nullable = true)

+------+---+
|  Name|Age|
+------+---+
| Scott| 50|
|  Jeff| 45|
|Thomas| 54|
|   Ann| 34|
+------+---+
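
The `long` type in the schema above follows from the pandas dtype: an `int64` column is inferred as `long`, and a `float64` column as `double`. Below is a minimal sketch, using pandas only and hypothetical two-row data (the names `complete` and `with_gap` are illustrative), of how a single missing value changes the dtype before conversion, and therefore the Spark schema you should expect:

```python
import pandas as pd

# Same columns as the example output above (hypothetical rows).
complete = pd.DataFrame({"Name": ["Scott", "Jeff"], "Age": [50, 45]})

# One missing age: pandas stores the column as float64 so it can hold NaN.
with_gap = pd.DataFrame({"Name": ["Scott", "Jeff"], "Age": [50, None]})

# Inspect the dtypes that Spark's schema inference will see.
print(complete["Age"].dtype)
print(with_gap["Age"].dtype)
```

Checking `dtypes` before calling `createDataFrame` is a cheap way to catch an integer column that has silently become floating point.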

spark to pandas

# some_df is an existing Spark DataFrame
pandas_df = some_df.toPandas()

spark df to pandas df

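from pyspark.sql import SparkSession

# sc is the SparkContext of the active session
spark = SparkSession.builder.getOrCreate()
sc = spark.sparkContext

some_df = sc.parallelize([
    ("A", "no"),
    ("B", "yes"),
    ("B", "yes"),
    ("B", "no"),
]).toDF(["user_id", "phone_number"])
pandas_df = some_df.toPandas()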
