Here are a couple of sqldf solutions.
First lets rename the columns containing a dot in their names to remove the dot since dot is an SQL operator. (Had we not wished to do that we could have referred to those columns in the SQL statement as Site_1
and Site_2
and it would have understood that we were referring to Site.1
and Site.2
.)
library(sqldf)
df1 = data.frame(Sites = c("A","B","C"), total = c(12,6,35))
df2 = data.frame(Site1 = c("A","A","B"), Site2 = c("B","C","C"),
Score = c(60,70,80))
Now that we have our inputs lets try a couple of approaches with sqldf:
sqldf with three sql statements
temp1 <- sqldf("SELECT * FROM df1 as a, df2 as b WHERE a.Sites = b.Site1 ")
temp2 <- sqldf("SELECT * FROM df1 as a, df2 as b WHERE a.Sites = b.Site2 ")
sqldf("SELECT
Site1,
b.Site2,
a.Score,
a.Total as Site1Total,
b.Total as Site2Total
FROM temp1 as a, temp2 as b
USING (Site1)
GROUP BY a.Total, b.Total")
sqldf reduced to a triple join
We can further reduce the above to a triple join which perhaps clarifies the essence of the computation. That is, the three SQL statements above can be reduced to this single statement:
> sqldf("SELECT Site1, Site2, Score, a1.total AS total1, a2.total AS total2
+ FROM df1 AS a1, df1 a2, df2 AS b
+ WHERE a1.Sites = Site1 AND a2.Sites = Site2")
Site1 Site2 Score total1 total2
1 A B 60 12 6
2 A C 70 12 35
3 B C 80 6 35
RetroSearch is an open source project built by @garambo | Open a GitHub Issue
Search and Browse the WWW like it's 1997 | Search results from DuckDuckGo
HTML:
3.2
| Encoding:
UTF-8
| Version:
0.7.4