From @G-Lynn on February 13, 2017 23:31
Attached is a simple example of an issue with using matmul to multiply an array of matrices in R. I think the issue is a discrepancy in the way the shape argument works in the R and Python versions.
I am trying to use the tf$matmul function to multiply an array of matrices by an array of vectors. As an example, I am trying to multiply [1, 2; 3, 4] * [1; 2] and [5,6; 7,8] * [3; 4]. The example works as expected in Python, but in the TensorFlow API for R, a dimension error is generated.
In Python, the code for the example is:
############# Beginning of Python code
import numpy as np
import tensorflow as tf
a = tf.constant(np.arange(1, 9, dtype=np.int32), shape=[2, 2, 2]) #create array of 2 matrices, each 2x2: (1,2; 3,4) and (5,6; 7,8)
b = tf.constant(np.arange(1, 5, dtype=np.int32), shape=[2, 2, 1]) #multiply each of the matrices by a 2x1 vector: (1,2)' and (3,4)'
sess = tf.Session()
sess.run(a)
c = tf.matmul(a, b)
sess.run(a)
sess.run(b)
sess.run(c) #the answer is the correct set of 2x1 vectors (5,11)' and (39,53)'
############# End of Python Code
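The expected result can be cross-checked without a TensorFlow session. The sketch below uses NumPy, on the assumption that `np.matmul` treats the leading axis as the batch axis in the same way `tf.matmul` does; the names `c` and `c_loop` are only illustrative.

```python
import numpy as np

# Same data as in the TensorFlow example: 2 matrices of 2x2 and 2 column vectors of 2x1.
a = np.arange(1, 9, dtype=np.int32).reshape(2, 2, 2)
b = np.arange(1, 5, dtype=np.int32).reshape(2, 2, 1)

# Batched product: the leading axis indexes the pairs to multiply.
c = np.matmul(a, b)

# The same product written out one pair at a time.
c_loop = np.stack([a[i] @ b[i] for i in range(a.shape[0])])

print(c.shape)     # (2, 2, 1)
print(c[:, :, 0])  # rows are (5, 11) and (39, 53)
```

Both forms agree, which shows that a batched matmul is just the ordinary matrix product applied to each pair along the first axis.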
When I try to implement this same example in R, an error is produced due to a difference in the way the dimensions of the arrays are indexed in the shape argument.
##################### Begin R Code
rm(list = ls())
devtools::install_github("rstudio/tensorflow")
library(tensorflow)
#Create an array of 2 matrices
A = list(matrix(1:4, nrow=2, byrow=TRUE), matrix(5:8, nrow=2, byrow=TRUE))
A = array(unlist(A), dim=c(2,2,2) ) #2 matrices of dimension 2x2
B = array(1:4, dim = c(2,1,2) ) #2 vectors of dimension 2x1
A_tf = tf$constant(A, dtype="float64", shape=c(2,2,2))
B_tf = tf$constant(B, dtype="float64", shape=c(2,1,2))
sess = tf$Session()
sess$run(A_tf)
sess$run(B_tf)
sess$run(tf$matmul(A_tf,B_tf)) #this call produces the dimension error
################## End R code
I believe the error comes from a discrepancy between the way the shape argument works in tf$constant (R) and the way it works in tf.constant (Python). In R, the number of elements in the array is the last entry of shape, so that shape = c(2,1,3) means an array of 3 vectors, each 2x1. In the Python implementation, the number of array elements is the first entry, so that shape=[3, 2, 1] means 3 vectors, each 2x1.
When the function tf$matmul(A_tf,B_tf) is called, I think the difference in indexing the shapes of the array is causing an error.
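To illustrate the suspected axis-order mismatch, here is a minimal NumPy sketch. It assumes the R arrays reach TensorFlow with their R dimensions unchanged, that is, with the matrix index on the last axis; whether the R package actually converts arrays this way is not verified here. `order="F"` is used only to mimic R's column-major fill, and `A_r`, `B_r`, `A_b`, `B_b` are illustrative names.

```python
import numpy as np

# Mimic the R arrays: column-major fill, matrix index on the LAST axis.
# unlist(A) in the R code yields 1,3,2,4,5,7,6,8.
A_r = np.array([1, 3, 2, 4, 5, 7, 6, 8], dtype=np.float64).reshape((2, 2, 2), order="F")
B_r = np.arange(1, 5, dtype=np.float64).reshape((2, 1, 2), order="F")

# Multiplying as-is pairs 2x2 matrices with 1x2 matrices, which cannot work.
try:
    np.matmul(A_r, B_r)
    failed = False
except ValueError as err:
    failed = True
    print("shape error:", err)

# Move the matrix index to the front: (2,1,2) becomes (2,2,1).
A_b = np.transpose(A_r, (2, 0, 1))
B_b = np.transpose(B_r, (2, 0, 1))
C = np.matmul(A_b, B_b)
print(C[:, :, 0])  # rows are (5, 11) and (39, 53)
```

If this is indeed the cause, the analogous axis permutation in R would be `aperm(A, c(3, 1, 2))` and `aperm(B, c(3, 1, 2))` before calling tf$constant; that is untested here.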
Thanks for your attention.
Copied from original issue: rstudio/tensorflow#88